Please enable JavaScript.
Coggle requires JavaScript to display documents.
Data Integration (Joins continued (think of tables as matrices (two types…
Data Integration
Joins continued
think of tables as matrices
two types of joins
outer join
Left outer join
same as inner join but adds rows from the right table that aren't corresponding in the right table
could find why some customers have orders and some do not
Right outer join
produces the same result as inner joint but adds rows that aren't corresponding in the left table
could help find cart abandonment issues
Full outer join
produces the same as inner, outer left, and outer right joins
Inner join
rows in the left are combined with any row in the right with the same identity
makes a 3rd table with the repeated information
inner joins are most commonly used
when trying to find the most information possible use left outer joins
inner and left outer joins will take care of most needs
math is the best way to learn joins
see examples
Processes of data integration
Joins
combines datasets with shared values
creates one or two rows containing information that was combined
some rows will have repeat information
Unions
Assumes there are multiple columns in common between "A-C"
combines columns into a new table
check example
Unions
use unions when datasets have unique cases of similar values
use full outer join to connect lists that overlap between rows
if there is no overlap then use a union
Always use unions with training and test cases
avoid developing separate processes