Text and Web Analytics
Data Mining vs. Text Mining
Similarities
both seek novel and useful patterns
both are semi-automated processes
Differences
nature of data
Data Mining: structured data in databases
Text Mining: unstructured data, e.g. Word documents, PDF files
Text Mining Applications
Marketing
Security Applications
Biomedical
Academic
Text Mining Application Areas
Information Extraction
identification of key phrases and relationships within text
Topic Tracking
based on a user profile and documents
predicts other documents of interest to user
Summarization
condenses a document to save the reader time
Categorization
identifying main themes of a document
Clustering
group similar documents together
Concept Linking
connects related documents by identifying their shared concepts
Question Answering
finding the best answer to a given question
knowledge-driven pattern matching
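
The Topic Tracking idea above (score documents against a user profile and surface the ones most likely to interest the user) can be sketched with a bag-of-words model and cosine similarity. This is only one simple way to do it, not a method prescribed by these notes. The profile text, the document titles and bodies, and the function names are all hypothetical, chosen for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic tokens (bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(profile_text, documents):
    """Return (title, score) pairs sorted from most to least similar to the profile."""
    profile = term_counts(profile_text)
    scored = [(title, cosine_similarity(profile, term_counts(body)))
              for title, body in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profile and documents, invented for illustration only.
profile_text = "text mining clustering documents"
documents = {
    "doc_a": "clustering groups similar documents in text mining",
    "doc_b": "the weather was sunny over the weekend",
}
ranking = rank_documents(profile_text, documents)
print(ranking)
```

Here doc_a shares four terms with the profile and ranks first, while doc_b shares none and scores 0.0. The same similarity measure could also serve as a starting point for Clustering, since grouping similar documents needs a notion of document-to-document similarity.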
Natural Language Processing (NLP)
What is NLP?
a subfield of artificial intelligence and computational linguistics
study of "understanding" natural human language
must account for grammatical and semantic constraints as well as context
Challenges of NLP
Part-of-speech tagging
depends on the definition of the term and the context in which it is used
Text Segmentation
e.g. analysis of free-form text
Syntactic ambiguities
a sentence can be interpreted in more than one way
Imperfect or Irregular Input
e.g. foreign accents
Speech acts and semantic analysis
understanding the meaning of words
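
The part-of-speech tagging challenge listed above (the right tag depends on the word's context) can be illustrated with a toy tagger. This is a minimal sketch under strong assumptions: the lexicon, the tag names, and the two context rules are hand-written and hypothetical. Practical taggers learn such decisions from annotated text rather than from fixed rules.

```python
# Hypothetical, hand-written lexicon: each word maps to the tags it can take.
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "i": ["PRON"],
    "we": ["PRON"],
    "to": ["TO"],
    "book": ["NOUN", "VERB"],
    "flight": ["NOUN"],
    "read": ["VERB"],
}


def tag(sentence):
    """Tag each word, using the previous tag to resolve ambiguous words."""
    tags = []
    for word in sentence.lower().split():
        candidates = LEXICON.get(word, ["UNK"])
        if len(candidates) == 1:
            choice = candidates[0]
        elif tags and tags[-1][1] == "DET" and "NOUN" in candidates:
            # After a determiner ("the book"), prefer the noun reading.
            choice = "NOUN"
        elif tags and tags[-1][1] in ("PRON", "TO") and "VERB" in candidates:
            # After a pronoun or "to" ("we book", "to book"), prefer the verb reading.
            choice = "VERB"
        else:
            # No rule applies: fall back to the first listed tag.
            choice = candidates[0]
        tags.append((word, choice))
    return tags


print(tag("I read the book"))
print(tag("We book a flight"))
```

The word "book" is tagged as a noun in the first sentence and as a verb in the second, purely because of the word that precedes it.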