NLP - Coggle Diagram
NLP
Text Cleaning
Stemming
Using
multiple stemming libraries, testing which one produces the most appropriate word forms across most or all of the documents
stemming algorithms
Porter
Lancaster
Snowball
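The comparison described above can be sketched as follows. This is a minimal sketch, assuming NLTK is installed; the helper name compare_stemmers and the sample words are hypothetical and not part of the original diagram. It runs the three listed stemmers over the same words so the outputs can be inspected side by side.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer


def compare_stemmers(words):
    """Return {stemmer_name: {word: stem}} for side-by-side inspection.

    compare_stemmers is a hypothetical helper written for illustration.
    """
    stemmers = {
        "porter": PorterStemmer(),
        "lancaster": LancasterStemmer(),
        # Assumption: the documents are in English.
        "snowball": SnowballStemmer("english"),
    }
    return {
        name: {word: stemmer.stem(word) for word in words}
        for name, stemmer in stemmers.items()
    }


# Hypothetical sample words; in practice, draw them from the corpus.
sample_words = ["running", "studies", "generously", "maximum"]
results = compare_stemmers(sample_words)
for name, stems in results.items():
    print(name, stems)
```

Judging which output is "most appropriate" remains a manual step that depends on the documents and the downstream task.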
Lemmatization
Stop word removal
Tokenization
Using
Regex
effective for matching words or phrases in a document
language- and context-specific method
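A regex-based tokenizer might look like the sketch below. It uses only Python's standard re module. The pattern and the name tokenize are assumptions chosen for illustration: the pattern keeps internal apostrophes and hyphens, which suits English text but would need adjusting for other languages or domains, which is the sense in which the method is language- and context-specific.

```python
import re

# Hypothetical pattern: a run of ASCII letters or digits, optionally joined
# by single internal apostrophes or hyphens (e.g. "don't", "over-think").
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")


def tokenize(text):
    """Lowercase the text and return every match of TOKEN_PATTERN, in order."""
    return TOKEN_PATTERN.findall(text.lower())


tokens = tokenize("Don't over-think it: tokenization is context-specific!")
print(tokens)
# ["don't", 'over-think', 'it', 'tokenization', 'is', 'context-specific']
```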
Remove Punctuation & noise
Using
Regex
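Punctuation and noise removal with regex could be sketched as below, using only the standard library. Treating URLs as noise is an assumption made for this example, and the function name remove_punctuation_and_noise is hypothetical. What counts as noise depends on the corpus.

```python
import re


def remove_punctuation_and_noise(text):
    """Strip URLs and punctuation, then normalize whitespace."""
    # Assumption: URLs are noise for this task, so drop them first.
    text = re.sub(r"https?://\S+", " ", text)
    # Replace any character that is neither a word character
    # (letter, digit, underscore) nor whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


cleaned = remove_punctuation_and_noise(
    "Hello, world!!! Visit https://example.com now..."
)
print(cleaned)  # Hello world Visit now
```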
Feature Extraction
N-grams
Bag of words
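Both feature types can be illustrated with a short standard-library sketch. The function names ngrams and bag_of_words and the sample tokens are hypothetical. An n-gram is a sequence of n adjacent tokens, and a bag of words here is a count of each token that ignores order.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


# Hypothetical, already-tokenized input.
tokens = ["the", "cat", "sat", "on", "the", "mat"]

bigrams = ngrams(tokens, 2)
bow = bag_of_words(tokens)

print(bigrams)
print(bow)
```

In this sample, "the" appears twice, so its bag-of-words count is 2, while the bigrams ("the", "cat") and ("the", "mat") stay distinct because n-grams preserve local word order.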