Natural Language Processing
Language
Language is more than just communication; it is the primary method by which we do things together.
Language is the accumulation of shared meaning, or common ground.
Formal Language
Formal languages are languages created by people for specific purposes.
Natural Language
Natural languages are the languages people speak, such as English. They evolved naturally.
NLP is a set of computational techniques for analyzing and representing naturally occurring texts for the purpose of achieving human-like language processing for a range of applications.
understanding
Linguistics
Linguistics is the scientific study of language. It focuses on formal, structural models of language and the discovery of language universals. The field of NLP was originally referred to as Computational Linguistics.
Phonology
- the study of speech sounds in their cognitive aspects
Morphology
- the study of the formation of words
Phonetics
- the study of speech sounds in their physical aspects
Syntax
- the study of the formation of sentences
Semantics
- the study of meaning
Pragmatics
- the study of language use
Lexicon
- the study of everything about distinct words, including their part of speech
Idioms
Non-standard English
Neologisms
Computer Science
concerned with developing internal representations of data and efficient processing of these structures
Cognitive Psychology
Cognitive psychology is the scientific study of the mind as an information processor. It has the goal of modeling the use of language in a psychologically plausible way
History
1950 - Turing's “Computing Machinery and Intelligence” paper proposes the Turing test
1954 - Georgetown experiment - automatic translation from Russian to English
1960s - ELIZA (Weizenbaum) - simulated psychotherapist with a restricted vocabulary set
1970s - Conceptual ontologies (ontology: the study of the nature of being) - programs structure real-world information into computer-understandable data
Until the 1980s - complex hand-written (programmed) rules
Challenges
Phonology
- speech recognition, part-of-speech tagging
Morphology
- segmentation, morphemes and words
Lexicon
- dictionary, word sense disambiguation, named entity recognition
Syntax
- generative grammar, coreference resolution, parsing, automatic summarization, XML, relationship extraction
Semantics
- meaning (all levels)
Discourse Analysis
- topic segmentation and recognition
Pragmatics
- common sense, world knowledge, machine translation (all), natural language generation, natural language understanding, question answering
OCR
- optical character recognition
N-gram Character Model
An n-gram is a contiguous sequence of n items from a given sample of text or speech. The items can be letters, words, or base pairs, depending on the application. N-grams are typically collected from a text or speech corpus.
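The n-gram definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `char_ngrams`, `word_ngrams`, and `ngram_counts` are hypothetical, and splitting words on whitespace is a simplifying assumption (real tokenization is usually more involved).

```python
from collections import Counter


def char_ngrams(text, n):
    """Return every contiguous run of n characters in text, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A text shorter than n has no n-grams, so the range is empty.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return every contiguous run of n words in text, as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Assumption: words are separated by whitespace.
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(corpus, n):
    """Count character n-grams across a corpus (an iterable of strings)."""
    counts = Counter()
    for document in corpus:
        counts.update(char_ngrams(document, n))
    return counts
```

For example, `char_ngrams("text", 2)` gives `["te", "ex", "xt"]`, and `word_ngrams("the cat sat", 2)` gives `[("the", "cat"), ("cat", "sat")]`. Counting n-grams over a corpus, as `ngram_counts` does, is the usual first step toward a character-level model, since relative frequencies can be estimated from the counts.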
Logical variations
Involves a variety of domains:
Time
A little space and physics
Knowledge
Perception
Naive psychology
Multi-agents
Web Links
http://www.bbc.co.uk/programmes/p032nvf4
http://hci.stanford.edu/~winograd/shrdlu/
http://www.bbc.co.uk/ontologies
http://conceptnet5.media.mit.edu/
http://web.media.mit.edu/~push/Kurzweil.html