8.1 Working with Text Data

Slides

References

Bag-of-Words model

What we covered in this video lecture

This lecture covers different ways to work with text data: going from raw text to preprocessed text, and then converting the preprocessed text into feature vectors that machine learning models can consume.
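To make the raw-text-to-feature-vector pipeline concrete, here is a minimal bag-of-words sketch in plain Python. The helper names (`preprocess`, `build_vocabulary`, `bag_of_words`) and the three example sentences are made up for illustration, and the preprocessing is deliberately simplistic (lowercasing plus a regular-expression tokenizer); a real pipeline would likely use a library tokenizer and vectorizer instead.

```python
import re
from collections import Counter

def preprocess(text):
    """Lowercase the text and split it into simple word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(tokenized_docs):
    """Map each unique token to a column index, in alphabetical order."""
    vocab = sorted({tok for doc in tokenized_docs for tok in doc})
    return {tok: idx for idx, tok in enumerate(vocab)}

def bag_of_words(tokens, vocab):
    """Return a count vector with one entry per vocabulary word."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocab]

# Toy corpus, chosen only for illustration
docs = [
    "The sun is shining",
    "The weather is sweet",
    "The sun is shining, the weather is sweet",
]

tokenized = [preprocess(d) for d in docs]
vocab = build_vocabulary(tokenized)
vectors = [bag_of_words(toks, vocab) for toks in tokenized]

print(vocab)    # token -> column index
for vec in vectors:
    print(vec)  # one count vector per document
```

Each document becomes a fixed-length vector of word counts, which is the kind of tabular input that models such as logistic regression expect. Note that word order is discarded in this representation.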

What types of machine learning models can we use? There are classic models for tabular data, like logistic regression and multilayer perceptrons. Then there are sequence models, like 1D convolutional networks and recurrent neural networks. Finally, and most importantly, there are transformer-based large language models, which are now state of the art when it comes to working with text.

Additional resources if you want to learn more

If you want to learn more about the tf-idf approach mentioned in this lecture, I made a walkthrough here: https://nbviewer.org/github/rasbt/pattern_classification/blob/master/machine_learning/scikit-learn/tfidf_scikit-learn.ipynb
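As a rough illustration of the idea behind tf-idf, the sketch below weights raw term counts by an inverse document frequency, assuming the plain textbook definition idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. The function name `tf_idf` and the toy documents are made up for this example; libraries such as scikit-learn use a smoothed idf and normalize the resulting vectors by default, so their numbers will differ from these.

```python
import math
from collections import Counter

def tf_idf(tokenized_docs):
    """Compute unsmoothed tf-idf weights: term count * log(N / document frequency)."""
    n_docs = len(tokenized_docs)

    # Document frequency: in how many documents does each token appear?
    df = Counter()
    for doc in tokenized_docs:
        df.update(set(doc))

    # Assumed textbook idf (no smoothing, no normalization)
    idf = {tok: math.log(n_docs / count) for tok, count in df.items()}

    weighted = []
    for doc in tokenized_docs:
        tf = Counter(doc)
        weighted.append({tok: tf[tok] * idf[tok] for tok in tf})
    return weighted

# Toy, already-tokenized documents, chosen only for illustration
docs = [
    ["the", "sun", "is", "shining"],
    ["the", "weather", "is", "sweet"],
    ["the", "sun", "is", "shining", "the", "weather", "is", "sweet"],
]

weights = tf_idf(docs)
for doc_weights in weights:
    print(doc_weights)
```

Words that occur in every document (here "the" and "is") receive a weight of zero under this definition, while words that occur in fewer documents are weighted more heavily.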

