Natural Language Processing
Introduction to NLP (Natural Language Processing)
Definition:
Interaction between computers and human languages using computational techniques to analyze and represent naturally occurring texts.
Purpose:
Achieving human-like language processing for various applications.
Formal vs Natural Language
Formal Language
Syntax and grammar are fixed.
Semantics (meaning) is well-defined.
Programming languages (Python, Java, C)
Natural Language
Ambiguity and dynamic changes.
Multiple meanings (e.g., "The chicken is ready to eat").
Human languages (English, Tamil, etc.)
Why NLP – Applications
Machine Translation:
(Translating "என் பெயர் ராம்" → "My name is Ram")
Text Summarization:
(Extracting keywords and summaries from articles.)
Information Extraction:
(Extracting meeting time and venue from an email.)
Context Analysis:
(Understanding trending topics on social media.)
Question Answering:
("What time is the next bus?" / "Which gene is associated with Diabetes?")
Sentiment Analysis:
(Analyzing product reviews and customer feedback.)
Main Components of NLP
Natural Language Understanding (NLU): Understanding the meaning of text.
Natural Language Generation (NLG): Generating natural language from structured data.
NLU Tasks
Morphological analysis
Syntactic analysis
Semantic analysis
Discourse analysis
NLG Tasks
Deep planning (what to say)
Syntactic generation (how to say it)
Surface realization (actual sentence formation)
Morphological and Lexical Analysis
Lexicon: Vocabulary of the language.
Morphology: Structure of words (prefix, root, suffix).
Lexical Analysis: Dividing text into paragraphs, sentences, words, and understanding their relationships.
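The prefix, root, suffix view of morphology can be sketched with plain string matching. This is a minimal illustration, not a real morphological analyzer: the affix lists and the `split_affixes` helper are hypothetical and chosen only for this example.

```python
# Hypothetical, deliberately tiny affix lists (assumptions for illustration).
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("ness", "ing", "ed", "ly", "s")


def split_affixes(word):
    """Split a word into (prefix, root, suffix) using simple string matching."""
    prefix = suffix = ""
    for p in PREFIXES:
        # Only strip the prefix if at least 3 characters remain.
        if word.startswith(p) and len(word) - len(p) >= 3:
            prefix, word = p, word[len(p):]
            break
    for s in SUFFIXES:
        # Only strip the suffix if at least 3 characters remain as the root.
        if word.endswith(s) and len(word) - len(s) >= 3:
            suffix, word = s, word[:-len(s)]
            break
    return prefix, word, suffix


print(split_affixes("unkindness"))
print(split_affixes("replayed"))
```

Real analyzers rely on much larger lexicons and on rules for spelling changes, which this sketch ignores.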
Syntactic Analysis
Purpose:
Ensure proper grammatical structure.
Example:
"The girl go to the school" → Incorrect Syntax.
Semantic Analysis
Focus:
Extract literal meaning.
Example:
"Colorless blue idea" → Semantically nonsensical.
Discourse Integration
Contextual Meaning:
Meaning depends on previous and future sentences.
Example:
"She wanted it" → "It" depends on previous conversation.
Pragmatic Analysis
Real-world knowledge & social context.
Example:
"Close the window?" should be understood as a request, not just a statement.
Natural Language Generation (NLG)
Discourse Planning: What to say.
Surface Realization: How to express it in words.
Lexical Selection: Choosing the right words.
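The three NLG steps above can be sketched as a template-based generator over one structured record. The record fields (`city`, `temp_c`), the temperature thresholds, and the `describe_weather` function are assumptions made up for this example.

```python
def describe_weather(record):
    """Turn a structured weather record into one English sentence."""
    # Discourse planning: decide what to say (here, city and temperature only).
    city = record["city"]
    temp = record["temp_c"]

    # Lexical selection: choose a word for the temperature (thresholds are arbitrary).
    if temp >= 30:
        adjective = "hot"
    elif temp <= 10:
        adjective = "cold"
    else:
        adjective = "mild"

    # Surface realization: put the chosen content and words into a sentence.
    return f"It is {adjective} in {city} today, at {temp} degrees Celsius."


print(describe_weather({"city": "Chennai", "temp_c": 34}))
```

Templates are the simplest form of surface realization; they trade flexibility for predictable output.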
History of NLP
1950s: Turing Test proposed; Georgetown experiment on machine translation.
1960s: ELIZA chatbot created (psychotherapist simulation).
1970s: Conceptual ontologies and early chatterbots; later chatterbots include Jabberwacky and ALICE.
1980s: Shift to machine learning approaches, driven by increases in computational power.
Ontologies and Semantic Web
Ontology:
Classifying and explaining entities (what exists).
Semantic Web:
Standardized way for machines to understand web page relationships.
General NLP Tasks
Tokenization (Segmenting text)
Stemming (Finding root forms)
Part of Speech (POS) tagging
Word Sense Disambiguation
Contextual Analysis
Sentiment Analysis
Segmentation Example
Breaking text into words or sentences.
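Segmentation of English text can be sketched with regular expressions from Python's standard `re` module. The function names are hypothetical, and the rules are deliberately naive: abbreviations such as "Dr." would be split incorrectly.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Return words (keeping internal apostrophes) and single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)


for sentence in split_sentences("Hello there. Don't stop!"):
    print(split_words(sentence))
```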
Named Entity Recognition (NER)
Identifying names, dates, places.
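A rule-based sketch of entity recognition is shown below, assuming dates written as YYYY-MM-DD, times written as HH:MM, and a small hypothetical list of known places. Practical NER systems are usually trained statistical or neural models; this only illustrates the input and output of the task.

```python
import re

# Hypothetical gazetteer of place names (assumption for illustration).
KNOWN_PLACES = {"Chennai", "London"}

DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")   # e.g. 2024-05-01
TIME = re.compile(r"\b\d{1,2}:\d{2}\b")       # e.g. 10:30


def find_entities(text):
    """Return (text, label) pairs for dates, times, and known places."""
    entities = [(m.group(), "DATE") for m in DATE.finditer(text)]
    entities += [(m.group(), "TIME") for m in TIME.finditer(text)]
    entities += [(w, "PLACE") for w in re.findall(r"\w+", text) if w in KNOWN_PLACES]
    return entities


print(find_entities("Meet in Chennai on 2024-05-01 at 10:30."))
```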
Stemming
Purpose:
Reducing words to their root form.
Example:
"running", "ran" → "run".
POS Tagging
Assigning grammatical categories.
Example:
"Today is a beautiful day." → "Today (Noun)", "is (Verb)", etc.
Word Sense Disambiguation
Determining meaning of words based on context.
Example: "bank" (financial institution) vs "bank" (river edge).
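One way to sketch disambiguation is to count how many context words overlap with a set of cue words for each sense, in the spirit of the Lesk algorithm. The cue-word sets in `SENSES` and the `disambiguate` function are assumptions invented for this example.

```python
import re

# Hypothetical cue words for each sense of "bank" (assumption for illustration).
SENSES = {
    "bank": {
        "financial institution": {"money", "deposit", "loan", "account", "cash"},
        "river edge": {"river", "water", "shore", "fishing", "boat"},
    }
}


def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    context = set(re.findall(r"\w+", sentence.lower()))
    senses = SENSES[word]
    # On a tie, max() returns the first sense listed.
    return max(senses, key=lambda sense: len(senses[sense] & context))


print(disambiguate("bank", "She went to the bank to deposit money."))
print(disambiguate("bank", "They sat on the bank of the river fishing."))
```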
Contextual and Sentiment Analysis
Extracting themes and emotional tone from texts.
Example: Positive or negative restaurant reviews.
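Classifying a restaurant review as positive or negative can be sketched with a word-list (lexicon) approach. The word lists and the `sentiment` function are hypothetical, and the sketch ignores negation ("not good") and sarcasm.

```python
import re

# Hypothetical sentiment word lists (assumptions for illustration).
POSITIVE = {"good", "great", "delicious", "friendly", "excellent"}
NEGATIVE = {"bad", "slow", "cold", "rude", "terrible"}


def sentiment(review):
    """Label a review by counting positive and negative words."""
    words = re.findall(r"[a-z']+", review.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The food was delicious and the staff were friendly."))
print(sentiment("Service was slow and the soup was cold."))
```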
Techniques Used in NLP
Statistical: Machine learning using large text corpora.
Connectionist: Neural networks and pattern learning from examples.
Symbolic: Deep linguistic analysis, human-verified rules.
Challenges in NLP
Speech recognition
Syntax parsing
Semantics understanding
Pragmatics and common sense reasoning
Word and sentence segmentation (especially in languages like Chinese)
Optical Character Recognition (OCR)
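Of the challenges listed above, word segmentation in a language written without spaces can be sketched with forward maximum matching: at each position, take the longest dictionary word that fits, falling back to a single character. The `DICTIONARY` contents and the `max_match` function are assumptions for this example; real segmenters use large dictionaries or trained models.

```python
# Hypothetical mini-dictionary of Chinese words (assumption for illustration).
DICTIONARY = {"我", "喜欢", "自然", "语言", "自然语言", "处理"}
MAX_LEN = max(len(word) for word in DICTIONARY)


def max_match(text):
    """Segment text greedily, preferring the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character always succeeds.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in DICTIONARY:
                tokens.append(piece)
                i += size
                break
    return tokens


print(max_match("我喜欢自然语言处理"))
```

Greedy matching can pick the wrong split when a longer word overlaps the correct boundary, which is one reason segmentation remains a challenge.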