Last Commit
May. 16, 2019
Mar. 25, 2018


Hands-On NLTK Tutorial

A hands-on NLTK tutorial in the form of Jupyter notebooks.

NLTK is one of the most popular Python packages for Natural Language Processing (NLP).

Index of Jupyter Notebooks

1.1 Downloading Libs and Testing That They Are Working
Getting ready to start!
1.2 Text Analysis Using nltk.text
Extracting interesting data from a given text
2.1 Deriving N-Grams from Text
Creating n-grams (for language classification)
2.2 Detecting Text Language by Counting Stop Words
A simple way to find out what language a text is written in
2.3 Language Identifier Using Word Bigrams
State-of-the-art language classifier
3.1 Bigrams, Stemming and Lemmatizing
NLTK makes bigrams, stemming and lemmatization super-easy
3.2 Finding Unusual Words in Given Language
Which words do not belong with the rest of the text?
3.3 Creating a POS Tagger
Creating a Parts Of Speech tagger
3.4 Parts of Speech and Meaning
Exploring awesome features offered by WordNet
4.1 Name Gender Identifier
Building a classifier that guesses the gender of a name
4.2 Classifying News Documents into Categories
Building a classifier that guesses the category of a news item
5.1 Sentiment Analysis
Is a movie review positive or negative?
5.2 Sentiment Analysis with nltk.sentiment.SentimentAnalyzer and VADER tools
More sentiment analysis!
6.1 Twitter Stream (and Cleaning Tweets)
Live-stream tweets from Twitter
6.2 Twitter Search
Search through past tweets
7.1 NLTK with the Greek Script
Using NLTK with non-Latin scripts
8.1 The langdetect and langid Libraries
Useful libraries for language identification
8.2 Word2Vec (gensim)
Google's Word2vec
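
As a rough taste of the kind of material covered in notebooks 2.1 and 3.1, here is a minimal sketch of n-gram extraction and stemming with NLTK. It is not taken from the notebooks themselves: the helper names `word_ngrams` and `stem_tokens` are hypothetical, and the sample sentence is invented for illustration. It assumes NLTK is installed and uses plain whitespace splitting so that no corpus or tokenizer data needs to be downloaded.

```python
from nltk.stem import PorterStemmer
from nltk.util import ngrams


def word_ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    # nltk.util.ngrams yields tuples lazily, so materialize them in a list.
    return list(ngrams(tokens, n))


def stem_tokens(tokens):
    """Reduce each token to its Porter stem."""
    stemmer = PorterStemmer()
    return [stemmer.stem(token) for token in tokens]


# Assumption: whitespace splitting stands in for a real tokenizer here,
# which avoids downloading any NLTK data for this sketch.
tokens = "the cats were running through the garden".split()

print(word_ngrams(tokens, 2))  # consecutive word pairs (bigrams)
print(stem_tokens(tokens))     # one stem per input token
```

For real text, a proper tokenizer such as `nltk.word_tokenize` would likely give better results than `str.split()`, at the cost of downloading the tokenizer data first.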


H. Z. Sababa (hb20007)

Distributed under the MIT license. See LICENSE for more information.