Chapter 12: Data Reduction and Splitting (12.2 Filtering (Titanic dataset…

- - - - removal of full rows based on identical content in a few columns
        
        data must first be sorted so that the rows to keep come first
        
        followed by a specification of which columns must be identical for duplicates to be removed
      - Example
        
        finding a home address from cell phone data
        
        make an assumption (e.g. the day's first use happens at home), then sort by date and time
        
        keep the longitude/latitude of each day's first use
    - - removal of full rows based on identical content in all columns
      - for a row to be removed, all values in all columns must match the same values in a prior row
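
The two kinds of duplicate removal can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the column names (`phone`, `timestamp`, `lat`, `lon`, `date`) and the helper `drop_duplicates` are all hypothetical, and the cell phone example assumes that the first use of each day happens at home.

```python
from datetime import datetime

# Hypothetical cell phone usage records (all values invented for illustration).
records = [
    {"phone": "A", "timestamp": "2024-03-01 18:40", "lat": 40.75, "lon": -73.99},
    {"phone": "A", "timestamp": "2024-03-01 06:05", "lat": 40.68, "lon": -73.94},
    {"phone": "A", "timestamp": "2024-03-02 06:30", "lat": 40.68, "lon": -73.94},
    {"phone": "A", "timestamp": "2024-03-02 12:15", "lat": 40.75, "lon": -73.99},
    {"phone": "A", "timestamp": "2024-03-02 12:15", "lat": 40.75, "lon": -73.99},  # exact duplicate
]


def drop_duplicates(rows, key_columns=None):
    """Keep the first row for each distinct combination of key_columns.

    With key_columns=None every column is compared, so a row is removed
    only if it matches an earlier row in all columns.
    """
    seen = set()
    kept = []
    for row in rows:
        columns = key_columns if key_columns is not None else sorted(row)
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Derive a date column so "one row per phone per day" can be expressed.
for row in records:
    row["date"] = row["timestamp"][:10]

# Step 1: sort so the row to keep (the earliest use of each day) comes first.
by_time = sorted(
    records, key=lambda r: datetime.strptime(r["timestamp"], "%Y-%m-%d %H:%M")
)

# Step 2: remove rows that repeat the same phone and date (a few columns).
first_use_per_day = drop_duplicates(by_time, key_columns=["phone", "date"])

# Full-row variant: only rows identical in every column are removed.
unique_rows = drop_duplicates(records)
```

Sorting before removing duplicates is what decides which row survives: because the helper keeps the first occurrence, the sort order determines the row that is kept.
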
- - - - a filter can be applied to create a new set
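
Filtering into a new set might look like the following minimal sketch. The passenger rows, the column names (`name`, `age`, `survived`) and the helper `filter_rows` are made up for illustration and are not taken from the actual dataset.

```python
# Made-up passenger rows; the column names are assumptions, not a real schema.
passengers = [
    {"name": "P1", "age": 29, "survived": True},
    {"name": "P2", "age": 4, "survived": True},
    {"name": "P3", "age": 41, "survived": False},
]


def filter_rows(rows, predicate):
    """Return a new list holding only the rows for which predicate is true."""
    return [row for row in rows if predicate(row)]


# The original list is left untouched; the filter produces a separate set.
adult_survivors = filter_rows(
    passengers, lambda r: r["age"] >= 18 and r["survived"]
)
```
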