How to Minimize Vagueness In Language
Stop Word Elimination
Definition/Goal
Ignoring terms that do not contribute to the ‘semantics’ of the documents
Problems with this method
Creation and maintenance effort
Stop words are domain-dependent: the term ‘computer’ could be a useful stop word for computer science journals
Approaches
managing a stop word list
elimination of high and low frequency terms
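The two approaches above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the stop word list, the function names, and the thresholds `min_df` and `max_df_ratio` are all assumptions, and "frequency" is interpreted here as document frequency.

```python
from collections import Counter

# Hypothetical, hand-maintained stop word list (illustrative only).
STOP_WORDS = {"the", "a", "of", "is", "in", "and", "to"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Approach 1: drop every token found in a managed stop word list."""
    return [t for t in tokens if t.lower() not in stop_words]


def remove_by_document_frequency(docs, min_df=2, max_df_ratio=0.8):
    """Approach 2: drop terms that occur in too few or too many documents.

    docs is a list of token lists. min_df and max_df_ratio are assumed
    cut-offs; suitable values depend on the collection.
    """
    n_docs = len(docs)
    # Count in how many documents each term appears (not how often overall).
    df = Counter(term for doc in docs for term in set(doc))
    keep = {t for t, n in df.items() if n >= min_df and n / n_docs <= max_df_ratio}
    return [[t for t in doc if t in keep] for doc in docs]
```

The frequency-based variant needs no maintained list, which addresses the maintenance problem noted above, but its thresholds still have to be tuned per collection.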
Stem and Base Reduction
Definition
General mapping of terms to the base or stem form
Types
Mapping to the base and stem form
Advantages
reduced number of terms to be managed
tracing to base and stem form is easier than query expansion
Disadvantages
specific search for word form becomes impossible ⇒ loss of precision
Query Expansion
Definition
All word forms are preserved in the query
Advantages
information content of the query remains fully intact
Disadvantages
query can become extensive ⇒ performance losses
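A small sketch of the idea, assuming a hypothetical word-form table `WORD_FORMS` (a real system would derive it from a lexicon). Each query term is replaced by the group of all its known word forms, and a document matches when every group is satisfied by at least one form.

```python
# Hypothetical word-form table; the entries are illustrative only.
WORD_FORMS = {
    "apply": ["apply", "applies", "applied", "applying"],
}


def expand_query(terms, word_forms=WORD_FORMS):
    """Replace each query term by the list of all its known word forms."""
    return [word_forms.get(term, [term]) for term in terms]


def matches(doc_tokens, expanded_query):
    """True if the document contains at least one form from every group."""
    tokens = set(doc_tokens)
    return all(any(form in tokens for form in group) for group in expanded_query)
```

The index keeps the original word forms, so a search for one specific form remains possible. The cost is visible in the expanded query: one term became four, which is where the performance loss comes from.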
Base And Stem Form Reduction
Definition
Base: removal of inflection endings and mapping to existing words
Example: applies ⇒ appl ⇒ apply
Stem: traces words back to their stem, which does not have to be a word itself
Example: computer, compute, computation, computerization ⇒ comput
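The difference between the two forms can be illustrated with a toy suffix stripper. The ending lists, the tiny `LEXICON`, and the function names are assumptions chosen only to reproduce the two examples above; they are far too small for real text.

```python
# Illustrative ending lists and lexicon; not a complete rule set.
INFLECTION_ENDINGS = ["ies", "ing", "ed", "es", "s"]
STEM_ENDINGS = ["erization", "ation", "er", "e"] + INFLECTION_ENDINGS
LEXICON = {"apply", "compute", "computer"}


def strip_ending(word, endings, min_len=3):
    """Remove the longest matching ending, keeping at least min_len letters."""
    for ending in sorted(endings, key=len, reverse=True):
        if word.endswith(ending) and len(word) - len(ending) >= min_len:
            return word[: -len(ending)]
    return word


def stem(word):
    """Stem form: the result does not have to be an existing word."""
    return strip_ending(word, STEM_ENDINGS)


def base_form(word):
    """Base form: strip an inflection ending, then map to an existing word."""
    if word in LEXICON:
        return word
    stripped = strip_ending(word, INFLECTION_ENDINGS)
    # Try the bare remainder first, then simple repairs such as appl -> apply.
    for repair in ("", "y", "e"):
        if stripped + repair in LEXICON:
            return stripped + repair
    return word
```

With these lists, `stem` maps computer, compute, computation, and computerization to `comput`, while `base_form("applies")` passes through `appl` and returns `apply`.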
Terms According to Kuhlen
Basic formal form
Stem form
Basic lexicographic form
Methods For Base and Stem Reduction
Rules
Dictionaries
Truncation
Lovins Algorithm
Rule-based (the English language needs about 20 rules)
Exceptions for irregular verbs necessary
Example: directions ⇒ direction ⇒ direct
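The combination of rules, repeated application, and an exception list can be sketched as below. This is a toy with two made-up rules and a hypothetical `IRREGULAR` table; it is not the Lovins algorithm, only an illustration of the general rule-based pattern.

```python
# Hypothetical exception table for irregular verbs (illustrative only).
IRREGULAR = {"went": "go", "was": "be", "bought": "buy"}

# Ordered (ending, replacement) rules; the first matching rule is applied.
RULES = [("s", ""), ("ion", "")]

MIN_LEN = 3  # assumed minimum length of what remains after stripping


def apply_one_rule(word):
    """Apply the first rule that matches, or return the word unchanged."""
    for ending, replacement in RULES:
        if word.endswith(ending) and len(word) - len(ending) >= MIN_LEN:
            return word[: -len(ending)] + replacement
    return word


def reduce_steps(word):
    """Return every intermediate form produced while reducing a word."""
    if word in IRREGULAR:
        # Irregular forms bypass the rules entirely.
        return [word, IRREGULAR[word]]
    steps = [word]
    while True:
        reduced = apply_one_rule(steps[-1])
        if reduced == steps[-1]:
            return steps
        steps.append(reduced)
```

Calling `reduce_steps("directions")` yields the chain from the example, and `reduce_steps("went")` shows why the exception table is needed: no ending rule could produce `go`.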
Compound Word Identification
Terminological Control