Chapter 8: Accessing and Sorting Data

tracking down data

what is the problem?

what is the unit of analysis?

accessing data: sourcing, then moving to a matrix

problems start w/ internal data

must remove data that isn't useful

use identity combinations to join sources
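
A minimal sketch of joining two sources on a combination of identity columns, using only the Python standard library. The records, the field names (`customer_id`, `order_id`, `shipped_date`) and the helper `join_on` are hypothetical, made up for illustration; real sources will have their own identity columns.

```python
# Hypothetical records from two internal sources (all values are made up).
orders = [
    {"customer_id": 1, "order_id": 10, "order_date": "2024-01-05"},
    {"customer_id": 1, "order_id": 11, "order_date": "2024-01-09"},
    {"customer_id": 2, "order_id": 12, "order_date": "2024-02-01"},
]
shipments = [
    {"customer_id": 1, "order_id": 10, "shipped_date": "2024-01-07"},
    {"customer_id": 2, "order_id": 12, "shipped_date": "2024-02-03"},
]


def join_on(left, right, keys):
    """Inner-join two lists of dicts on a combination of identity columns."""
    # Index the right-hand source by the tuple of identity values.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    joined = []
    for row in left:
        # Only rows whose identity combination appears in both sources survive.
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


joined = join_on(orders, shipments, ["customer_id", "order_id"])
for row in joined:
    print(row)
```

Order 11 has no matching shipment, so an inner join drops it; whether unmatched rows should be kept depends on the problem being asked.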

external data requires APIs

removing columns that are not relevant

if in doubt, keep the data
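
One way to sketch column removal that follows the "if in doubt, keep" rule: drop only the columns explicitly listed as irrelevant, and refuse to drop anything marked as an identity column. The column names (`fax`, `notes`) and the helper `drop_columns` are assumptions for illustration only.

```python
# Hypothetical rows; every column name here is an assumption.
rows = [
    {"order_id": 10, "customer_id": 1, "order_date": "2024-01-05", "fax": "", "notes": "gift"},
    {"order_id": 11, "customer_id": 1, "order_date": "2024-01-09", "fax": "", "notes": ""},
]

IDENTITY_COLUMNS = {"order_id", "customer_id"}  # needed later to join sources
DROP = {"fax"}  # only columns we are sure are not relevant


def drop_columns(rows, drop, protected):
    """Return copies of the rows without the columns in `drop`."""
    blocked = drop & protected
    if blocked:
        # Dropping an identity column would make later joins impossible.
        raise ValueError(f"refusing to drop identity columns: {sorted(blocked)}")
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


trimmed = drop_columns(rows, DROP, IDENTITY_COLUMNS)
print(trimmed[0])
```

The `notes` column stays because its usefulness is uncertain; only `fax` was listed for removal.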

write down identity columns

each box represents a table of data

customers

indiv. orders

order details

employees

Order ID

Customer ID

Employee ID

Order date

Shipped date

one-to-many relationship

must be able to connect data together across sources
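
A small sketch of a one-to-many relationship, assuming one customer can have many orders and the two tables are connected through a shared Customer ID. The names and values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical tables: one row per customer, many rows per customer in orders.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 2},
]

# Group the "many" side by the identity column it shares with the "one" side.
orders_by_customer = defaultdict(list)
for order in orders:
    orders_by_customer[order["customer_id"]].append(order["order_id"])

# Connect each customer to all of their orders.
summary = {
    c["name"]: orders_by_customer.get(c["customer_id"], [])
    for c in customers
}
print(summary)
```

The same pattern would apply to any pair of tables that share an identity column, such as orders and order details linked by Order ID.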