Chapter 12: Data Reduction and Splitting (Filtering (Often there is a set…

- - - - Removal of full rows based on identical content of a few columns
        
        Data must first be sorted so that the rows to keep come first; then specify which columns must be identical for duplicates to be removed
        
        Leaves us with one record per day
        
        Summarize Function: each unique id can be grouped with each unique configuration of longitude and latitude into discrete buckets
        
        Often put most frequent data first
        
        Unique Function: Based on ID row
      - very complex process
    - - Complete match removal
        
        Not often needed because rows aren't usually identical
        
        Removal of full rows based on identical content in all columns
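
The two kinds of duplicate removal above can be sketched in plain Python. This is only an illustration under assumptions, not the Summarize or Unique functions the notes refer to: the helper `drop_duplicates`, the sample `records`, and the column names `id`, `day`, and `reading` are all hypothetical.

```python
def drop_duplicates(rows, key_columns=None):
    """Keep the first row seen for each distinct key and drop later matches.

    If key_columns is None, every column is compared (complete match).
    Hypothetical helper written for illustration only.
    """
    seen = set()
    kept = []
    for row in rows:
        columns = key_columns if key_columns is not None else sorted(row)
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Made-up sample data
records = [
    {"id": 1, "day": "2024-01-01", "reading": 5},
    {"id": 1, "day": "2024-01-01", "reading": 5},  # identical in all columns
    {"id": 1, "day": "2024-01-01", "reading": 9},  # same id and day only
    {"id": 2, "day": "2024-01-01", "reading": 7},
]

# Complete match removal: rows must agree in every column to be dropped.
complete = drop_duplicates(records)

# Partial match removal: sort so the row to keep comes first
# (here, assumed to be the highest reading), then compare only id and day.
ordered = sorted(records, key=lambda r: r["reading"], reverse=True)
partial = drop_duplicates(ordered, key_columns=["id", "day"])
```

Because the first row in each group is the one kept, the sort order decides which record survives the partial match, which is why the sort has to happen before the duplicates are removed.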