Representing and Mining Data
Data in an unstructured, hard-to-interpret form: text data
Why text is important
records, logs, etc
Twitter, FB, Reddit, Google
Understanding the customer, understanding the text
Why text is difficult
Unstructured data
Dirty data
Synonyms, homographs, spelling/grammar errors
Context is very important
Preprocessing is necessary
Representation
Information Retrieval
Document = one piece of text (no matter size)
Tokens or terms = individual words (sometimes phrases)
Corpus = collection of documents
Bag of words
Every document is just a collection of words
ignores all other structure/form/grammar
is the token present?
1/0
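The presence representation above can be sketched in a few lines of plain Python; the two-sentence corpus is an illustrative assumption, not from the notes:

```python
# Minimal sketch: binary bag-of-words (token presence) representation.
corpus = [
    "the iphone was announced today",
    "apple announced the new iphone",
]

# Build the vocabulary from the whole corpus.
vocab = sorted({token for doc in corpus for token in doc.split()})

# Each document becomes a 1/0 vector: is the token present?
vectors = [[1 if term in doc.split() else 0 for term in vocab] for doc in corpus]

for doc, vec in zip(corpus, vectors):
    print(doc, "->", vec)
```

Note that all word order and grammar is discarded; only presence in the vocabulary survives.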
Term frequency
count of use
normalized; all in lowercase
iphone|IPHONE|iPhone
Stemming: stripping suffixes/inflections from words
announces|announced|announcing
stopwords
the|and|of|on
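The three preprocessing steps above (lowercasing, stemming, stopword removal) can be sketched as a tiny pipeline. The stopword list and suffix rules below are crude illustrative assumptions, not a standard stemmer such as Porter's:

```python
# Minimal preprocessing sketch: normalization, stopword removal, stemming.
STOPWORDS = {"the", "and", "of", "on"}       # tiny illustrative list
SUFFIXES = ("ing", "ed", "es", "s")          # crude illustrative rules

def stem(token):
    # Strip the first matching suffix; a real stemmer uses many more rules.
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    tokens = text.lower().split()                        # normalize case
    tokens = [t for t in tokens if t not in STOPWORDS]   # drop stopwords
    return [stem(t) for t in tokens]                     # reduce inflections

print(preprocess("Apple announces the iPhone and iPad"))
# → ['apple', 'announc', 'iphone', 'ipad']
```

This maps "announces"/"announced"/"announcing" to the same stem, at the cost of producing non-words like "announc".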
Measuring Sparseness|Inverse Document Frequency
Terms should not be too rare or too common
IDF(t) = 1 + log (total number of documents / number of documents containing t)
TFIDF(t, d) = TF(t, d) × IDF(t)
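The two formulas above translate directly into code; the tiny jazz-themed corpus is an illustrative assumption:

```python
import math

# Minimal sketch of the formulas above:
#   IDF(t)      = 1 + log(N / n_t)
#   TFIDF(t, d) = TF(t, d) * IDF(t)
corpus = [
    ["jazz", "musician", "plays", "jazz"],
    ["famous", "jazz", "club"],
    ["famous", "musician"],
]
N = len(corpus)

def idf(term):
    n_t = sum(1 for doc in corpus if term in doc)  # docs containing term
    return 1 + math.log(N / n_t)

def tfidf(term, doc):
    return doc.count(term) * idf(term)             # TF is a raw count here

# "jazz" occurs twice in document 0 and in 2 of the 3 documents.
print(round(tfidf("jazz", corpus[0]), 3))
```

A term in every document gets the minimum IDF of 1; a rare term is boosted.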
Feature selection
with normalization
Jazz musicians
Small corpus of 15 documents
stemming can strip letters that are actually needed
e.g. Kansas → "kansa", famous → "famou"
Nearest neighbors/similarities with words
IDF and Entropy
Similar in spirit to entropy, but IDF is not an expected-value calculation
Beyond bag of words
N Gram sequences
Sequences of adjacent terms
commonly paired words joined with _ (e.g. bag_of_words)
N-grams up to three
singles, doubles, and triples
they greatly increase the size of the feature set
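Extracting singles, doubles, and triples from a token sequence can be sketched as follows; the sample sentence is an illustrative assumption:

```python
# Minimal sketch: n-grams up to trigrams, adjacent words joined with "_".
def ngrams(tokens, max_n=3):
    grams = []
    for n in range(1, max_n + 1):                  # 1-, 2-, then 3-grams
        for i in range(len(tokens) - n + 1):
            grams.append("_".join(tokens[i : i + n]))
    return grams

tokens = "the quick brown fox".split()
print(ngrams(tokens))
```

Four tokens already yield nine features (4 unigrams + 3 bigrams + 2 trigrams), which shows how quickly the feature set grows.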
Named entity extraction
Preprocessing components
HP, H-P, Hewlett Packard
Knowledge intensive
must be learned or coded by hand
Oakland Raiders
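Since entity knowledge "must be learned or coded by hand", a hand-coded dictionary lookup is the simplest sketch. The alias table and matching-by-substring approach below are illustrative assumptions, not a real extractor:

```python
# Minimal sketch: dictionary-based named entity extraction with a
# hand-coded alias table mapping surface forms to canonical names.
ENTITIES = {
    "hp": "Hewlett-Packard",
    "h-p": "Hewlett-Packard",
    "hewlett packard": "Hewlett-Packard",
    "oakland raiders": "Oakland Raiders",
}

def extract_entities(text):
    text = text.lower()
    # Naive substring matching; real systems tokenize and disambiguate.
    return [canon for alias, canon in ENTITIES.items() if alias in text]

print(extract_entities("HP and the Oakland Raiders were mentioned"))
```

The point of the sketch is that "HP", "H-P", and "Hewlett Packard" all resolve to one canonical entity, which bag-of-words alone cannot do.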
Topic model
Topic layer
model sets of topics independently
Words map to one or more topics
Latent semantic indexing
Probabilistic topic modeling