Chapter 10 Representing and Mining Text (Representation (First approach…

- - - - A collection of individual words
      - Term Frequency
        
        Normalized: every word is lowercase
        
        Stemmed: suffixes removed
        
        Stopwords: remove very common words such as "the", "and", "of"
        
        Measuring sparseness
        
        Not too rare
        
        Not overly common
        
        Imposing upper and lower limits on term frequency
        
        TFIDF: Term Frequency & Inverse Document Frequency
        
        Example: Jazz Musicians
        
        The relationship of IDF and Entropy
        
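The term-frequency and TFIDF items above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own code: the stopword list and the function names (`tokenize`, `term_frequency`, `inverse_document_frequency`, `tfidf`) are assumptions made for the example, stemming is omitted, and IDF is computed in one common form, 1 + log(total documents / documents containing the term).

```python
import math
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "and", "of", "a", "in", "is"}


def tokenize(text):
    """Normalize to lowercase, split on whitespace, and drop stopwords.

    Punctuation handling and stemming are left out to keep the sketch short.
    """
    return [w for w in text.lower().split() if w not in STOPWORDS]


def term_frequency(tokens):
    """Return each term's count divided by the document length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(docs_tokens):
    """IDF(t) = 1 + log(N / number of documents containing t)."""
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: 1 + math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf(docs):
    """Return one {term: TF * IDF} dictionary per input document."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = inverse_document_frequency(docs_tokens)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for tokens in docs_tokens
    ]


if __name__ == "__main__":
    corpus = ["The jazz of Coltrane", "the jazz and the blues", "Coltrane"]
    for doc, scores in zip(corpus, tfidf(corpus)):
        print(doc, "->", scores)
```

In the second document, "blues" appears in only one document of the corpus while "jazz" appears in two, so "blues" receives the higher TFIDF score even though both terms occur once in that document. Upper and lower limits on term frequency could be imposed by filtering `doc_freq` before computing IDF.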
- - - - Period: same day
      - Change or no change
      - Surge, Stable and Plunge
    - - The stream of news stories
      - A corresponding stream of daily stock prices
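The two labeling schemes above (change / no change, and surge / stable / plunge) can be sketched as functions of a stock's same-day price movement. The thresholds below (5% for surge and plunge, 2.5% for stable and for "change") and the function names are illustrative assumptions, not values taken from this outline.

```python
def percent_change(open_price, close_price):
    """Fractional same-day price change, e.g. 0.06 for a 6% rise."""
    return (close_price - open_price) / open_price


def label_three_class(open_price, close_price, big_move=0.05, small_move=0.025):
    """Label a day as 'surge', 'plunge', or 'stable'.

    Days that fall between the two thresholds are ambiguous and return None,
    so they can be excluded from training data.
    """
    change = percent_change(open_price, close_price)
    if change >= big_move:
        return "surge"
    if change <= -big_move:
        return "plunge"
    if abs(change) <= small_move:
        return "stable"
    return None


def label_two_class(open_price, close_price, threshold=0.025):
    """Collapse the problem to 'change' versus 'no change'."""
    change = percent_change(open_price, close_price)
    return "change" if abs(change) > threshold else "no change"


if __name__ == "__main__":
    for close in (106.0, 101.0, 94.0, 103.0):
        print(close, label_three_class(100.0, close), label_two_class(100.0, close))
```

Each news story would then be paired with the label of the stock's price movement on the same day, giving a labeled text collection for classification.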