BI week 8 (Text and web Analytics)
Commonalities between data mining and text mining
Both seek novel and useful patterns
Both are semi-automated processes
Differences between data mining and text mining
Data mining: structured data
Text mining: unstructured data (text)
Step 1: Establish the corpus
Step 2: Create the term-by-document matrix
Step 3: Extract patterns/knowledge
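
The steps above can be sketched in a few lines of Python. This is a minimal illustration only: the three-sentence corpus, the `tokenize` helper, and the "shared terms" pattern are assumptions made for the example, not part of the course material. Real projects would normally add stop-word removal, stemming, and term weighting.

```python
import re
from collections import Counter

# Hypothetical three-document corpus (illustrative, not from the notes).
corpus = [
    "Text mining finds patterns in text",
    "Data mining finds patterns in structured data",
    "Web mining applies data mining to the web",
]

def tokenize(document):
    """Lowercase a document and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", document.lower())

# Establish the corpus: here this just means tokenizing each document.
tokenized = [tokenize(doc) for doc in corpus]

# Create the term-by-document matrix: one row per term, one column per document,
# each cell holding the raw count of that term in that document.
terms = sorted({token for tokens in tokenized for token in tokens})
counts = [Counter(tokens) for tokens in tokenized]
matrix = {term: [c[term] for c in counts] for term in terms}

# Extract a very simple pattern: terms that occur in more than one document.
shared_terms = [t for t, row in matrix.items() if sum(1 for n in row if n > 0) > 1]

for term in terms:
    print(f"{term:<12}{matrix[term]}")
print("Shared terms:", shared_terms)
```

Each row of `matrix` is one term and each column is one document, so the row for "mining" shows how often that word occurs in each of the three documents.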
Text Mining applications
Marketing
Security applications
Biomedical
Academic
Areas
Information extraction
Topic tracking
Summarization
Categorization
Clustering
Concept Linking
Question answering
Natural Language Processing (NLP)
Structuring a collection of text
Old approach
New approach
Challenges of NLP
Part-of-speech tagging
Text segmentation
Syntactic ambiguities
Text contains acronyms
Imperfect or irregular input
Speech acts and semantics
Challenges of web mining
Too big for effective data mining
Too complex
Too dynamic
Not specific to a domain
Web has everything
3 types of web mining
Web content
Web structure
Web usage
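
The three types can be told apart by the data each one looks at. The sketch below is a rough illustration using only the Python standard library: the HTML snippet, the `PageParser` class, and the `clickstream` records are all made up for the example. Page text stands in for web content, hyperlinks for web structure, and visit records for web usage.

```python
from collections import Counter
from html.parser import HTMLParser

class PageParser(HTMLParser):
    """Collects visible text (content) and hyperlink targets (structure)."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Hyperlinks are the raw material of web structure mining.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Visible text is the raw material of web content mining.
        if data.strip():
            self.text_parts.append(data.strip())

# Hypothetical page, invented for this example.
page = (
    "<html><body><h1>Sale</h1><p>New laptops</p>"
    '<a href="/cart">Cart</a><a href="/help">Help</a></body></html>'
)
parser = PageParser()
parser.feed(page)

content = " ".join(parser.text_parts)  # web content
links = parser.links                   # web structure

# Hypothetical (user, page) visit records, as might come from a server log.
clickstream = [
    ("u1", "/home"), ("u1", "/cart"),
    ("u2", "/home"), ("u2", "/help"),
    ("u3", "/home"),
]
page_views = Counter(path for _, path in clickstream)  # web usage

print(content)
print(links)
print(page_views.most_common(1))
```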
KDD for Web Mining
Business understanding
Data understanding
Data Preparation
Modeling
Evaluation
Deployment