Text and Web Analytics
Week 8
Data mining works on structured data held in databases
Text mining works on unstructured data, e.g. Word documents
Text Mining Applications
Marketing
Increase cross-selling and up-selling by analyzing call center data
Blogs, user reviews of products reveal user sentiments
Customer relationship management to increase overall lifetime value of customer
Security Applications
Spam filtering
Deception detection
Biomedical
DNA analysis, analysis of gene expression, etc.
Academic
Retrieval of information to answer specific queries
Text Mining Application Areas
Information extraction: identification of key phrases and relationships within text by looking for patterns
Topic tracking: based on a user profile and documents, text mining can predict other documents of interest to the user
Summarization: to save time for the reader
Categorization: identifying the main themes of a document and placing it into a predefined set of categories
Clustering: group similar documents together
Concept linking: connects related documents by identifying their shared concepts
Question answering: finding the best answer to a given question through knowledge-driven pattern matching
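As a minimal sketch of the categorization area above, the following assigns a document to one of a predefined set of categories by counting keyword overlaps. The category names, the keyword sets, and the function name `categorize` are illustrative assumptions, not part of the original notes; practical systems usually learn the categories from labelled documents.

```python
# Hypothetical predefined categories with hand-picked keywords (assumption).
CATEGORY_KEYWORDS = {
    "marketing": {"customer", "sales", "campaign", "brand"},
    "security": {"spam", "fraud", "phishing", "deception"},
    "biomedical": {"gene", "dna", "protein", "clinical"},
}


def categorize(text, default="uncategorized"):
    """Return the category whose keyword set overlaps most with the text."""
    tokens = set(text.lower().split())
    # Score each category by the number of its keywords found in the text.
    scores = {name: len(tokens & words) for name, words in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default when no keyword matched at all.
    return best if scores[best] > 0 else default
```

For example, `categorize("spam and phishing emails")` picks the category with the most matching keywords, and a text with no matches falls back to `default`.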
Natural Language Processing (NLP)
A very important component in text mining
A subfield of artificial intelligence and computational linguistics
Challenges of Text Mining (NLP)
Part-of-speech tagging: depends not only on the definition of a term but also on the context in which it is used
Text segmentation: e.g. analysis of free-form text found in e-mails and recorded telephone transcripts
Syntactic ambiguities, e.g. grammar ambiguity
Text contains acronyms, abbreviations, and misspellings, e.g. customer, cust, customar, csmr
Imperfect or irregular input, e.g. foreign accents
Speech acts and semantic analysis: understanding the meaning of words
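The abbreviation and misspelling challenge above (customer, cust, customar, csmr) can be sketched as a small normalization step. This is one possible approach under stated assumptions: the vocabulary, the abbreviation table, and the 0.8 similarity cutoff are all hypothetical choices, and fuzzy matching uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

# Hypothetical canonical vocabulary and hand-made abbreviation table (assumptions).
VOCABULARY = ["customer", "account", "invoice"]
ABBREVIATIONS = {"cust": "customer", "csmr": "customer"}


def normalize(token, cutoff=0.8):
    """Map an abbreviation or misspelling to a canonical term when possible."""
    token = token.lower()
    # Known abbreviations are resolved by direct lookup.
    if token in ABBREVIATIONS:
        return ABBREVIATIONS[token]
    # Otherwise look for a sufficiently similar vocabulary term.
    matches = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else token


normalized = [normalize(t) for t in ["customer", "cust", "customar", "csmr"]]
```

Short abbreviations such as "csmr" are too dissimilar for fuzzy matching, which is why the sketch keeps a separate lookup table for them.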
Text Mining processes
Step 1: Establish the corpus
Collect all relevant unstructured data
Digitize, standardize the collection
Place the collection in a common place
Step 2: Create the Term-by-Document Matrix
Create the Term-by-Document Matrix (TDM)
Goal: to create a TDM where the cells are filled with the most appropriate indices
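A minimal sketch of Step 2, assuming raw term counts as the cell index. That is only one possible choice of "most appropriate indices"; binary presence or TF-IDF weights are common alternatives. The function name `build_tdm` and the sample documents are made up for illustration.

```python
from collections import Counter


def build_tdm(documents):
    """Build a term-by-document matrix: one row per term, one column per document."""
    # Naive tokenization: lowercase and split on whitespace (assumption).
    tokenized = [doc.lower().split() for doc in documents]
    terms = sorted({term for tokens in tokenized for term in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # Each cell holds how often the row's term occurs in the column's document.
    matrix = [[doc_counts[term] for doc_counts in counts] for term in terms]
    return terms, matrix


docs = ["the web is big", "the web is dynamic", "text mining"]
terms, tdm = build_tdm(docs)
```

Here `terms` lists the row labels in alphabetical order and `tdm[i][j]` is the count of `terms[i]` in `docs[j]`.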
Step 3: Extract patterns/knowledge
Classification
Clustering
Association
Trend Analysis
Sentiment Analysis
Sentiment Analysis Application
Voice of the customer (VOC)
Gets data from the full set of customer touch points
VOC is a key element of customer experience management initiatives
Voice of the market (VOM): understanding aggregate opinions and trends
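Sentiment analysis of customer feedback can be sketched with a simple word-list approach. The positive and negative word lists and the function names below are assumptions for illustration; this is a lexicon-based sketch, not a trained sentiment model.

```python
import re

# Tiny hypothetical sentiment lexicons (assumption).
POSITIVE = {"great", "good", "love", "helpful"}
NEGATIVE = {"bad", "slow", "broken", "hate"}


def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def label(text):
    """Turn the numeric score into a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A word-list approach ignores negation and context ("not good" still scores as positive), which ties back to the NLP challenges listed earlier.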
Web Mining
Web is the largest repository of data
Challenges
The Web is too big for effective data mining
The Web is too complex
The Web is too dynamic
The Web is not specific to a domain
The Web has everything
Web mining is the process of discovering intrinsic relationships from Web data
Used for competitive intelligence, information/news/opinion collection, sentiment analysis, and automated data collection
Web Structure Mining
The development of useful information from the links included in the Web documents
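As one illustration of Web structure mining, the sketch below extracts the hyperlinks from a few pages and counts how many distinct pages link to each target. The page contents, the `LinkCollector` class, and `inbound_link_counts` are hypothetical; the inbound-link count is only a simple stand-in for more elaborate link-analysis measures.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def inbound_link_counts(pages):
    """Count, for each target URL, how many distinct pages link to it."""
    counts = Counter()
    for url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        # Count each source page once per target and ignore self-links.
        for target in set(collector.links):
            if target != url:
                counts[target] += 1
    return counts


# Made-up sample site (assumption).
PAGES = {
    "/home": '<a href="/products">Products</a> <a href="/about">About</a>',
    "/products": '<a href="/home">Home</a> <a href="/about">About</a>',
    "/about": '<a href="/home">Home</a>',
}
counts = inbound_link_counts(PAGES)
```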
Web Usage Mining (Web Analytics)
Extraction of information from clickstream analysis of web server logs generated through Web page visits and transactions
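A minimal clickstream sketch, assuming the server writes logs in the Common Log Format. The sample log lines are made up, and the regular expression and the function name `parse_clickstream` are illustrative assumptions; real log layouts vary by server configuration.

```python
import re
from collections import Counter, defaultdict

# Pattern for Common Log Format lines (assumed log layout).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Made-up sample log (assumption).
SAMPLE_LOG = """\
10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /home HTTP/1.1" 200 2326
10.0.0.1 - - [10/Oct/2023:13:56:01 +0000] "GET /products HTTP/1.1" 200 5120
10.0.0.2 - - [10/Oct/2023:13:57:12 +0000] "GET /home HTTP/1.1" 200 2326
10.0.0.1 - - [10/Oct/2023:13:58:40 +0000] "GET /checkout HTTP/1.1" 200 1042
"""


def parse_clickstream(log_text):
    """Return page-view counts and the ordered page path of each visitor."""
    page_views = Counter()
    paths_by_visitor = defaultdict(list)
    for line in log_text.splitlines():
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # Skip lines that do not fit the assumed format.
        page_views[match["path"]] += 1
        paths_by_visitor[match["host"]].append(match["path"])
    return page_views, dict(paths_by_visitor)


page_views, paths = parse_clickstream(SAMPLE_LOG)
```

Identifying a visitor by host address alone is a simplification; session cookies or user IDs are typically more reliable when they are available.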
KDD for Web Mining
Step 1: Business Understanding
Plausible goals for web-based mining:
Data Understanding and Data Preparation
Modeling
Evaluation
Deployment