Summarization

When the available data is not recorded at the right unit of analysis, it must be aggregated or summarized

The summarize feature is also known as "group" or "group by"

One or more columns are selected to group the data by, essentially creating a virtual "bucket" for each unique combination of values in those columns
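A minimal sketch of the bucketing idea, using only the Python standard library. The sample rows, the column names (region, product, sales), and the helper group_rows are hypothetical and exist only for illustration; a real tool would do this internally.

```python
from collections import defaultdict

# Hypothetical sample rows; column names are illustrative only.
rows = [
    {"region": "East", "product": "A", "sales": 10},
    {"region": "East", "product": "B", "sales": 5},
    {"region": "West", "product": "A", "sales": 7},
    {"region": "East", "product": "A", "sales": 3},
]


def group_rows(rows, keys):
    """Place each row in the bucket identified by its values in `keys`."""
    buckets = defaultdict(list)
    for row in rows:
        # The bucket key is the tuple of values in the grouping columns.
        bucket_key = tuple(row[k] for k in keys)
        buckets[bucket_key].append(row)
    return dict(buckets)


# Grouping by one column gives one bucket per region.
by_region = group_rows(rows, ["region"])

# Grouping by two columns gives one bucket per (region, product) pair.
by_region_product = group_rows(rows, ["region", "product"])
```

Grouping by more columns produces more, smaller buckets, because each distinct combination of values gets its own bucket.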

Typical functions that can be applied to data inside each bucket:

Count

Sum

Min

Max

First

Last

Average

Median

Mode

Standard deviation
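The functions listed above can be sketched with Python's standard statistics module. The records and the helper summarize are hypothetical. Two assumptions are worth noting: "first" and "last" follow the order in which rows arrive, and the standard deviation shown is the sample standard deviation (a tool may use the population version instead).

```python
import statistics
from collections import defaultdict

# Hypothetical (region, sales) records, in arrival order.
records = [
    ("East", 10), ("East", 5), ("East", 5),
    ("West", 7), ("West", 9), ("West", 9),
]

# Build one bucket of sales values per region.
buckets = defaultdict(list)
for region, sales in records:
    buckets[region].append(sales)


def summarize(values):
    """Apply the typical summary functions to one bucket of numbers."""
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "first": values[0],    # depends on row order
        "last": values[-1],    # depends on row order
        "average": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Assumption: sample standard deviation; needs at least 2 values.
        "std_dev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# One summary row per bucket.
summary = {region: summarize(values) for region, values in buckets.items()}
```

Each bucket collapses to a single row of results, which is how summarizing moves the data from the detail level up to the grouping level.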

A tool similar to summarize is the crosstab

Where summarize provides summary and aggregate information on existing columns, crosstab uses the content inside columns to create new columns

Crosstab is a way to deal with data that is currently in "skinny" form, by transforming what is currently listed in rows into columns
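A rough sketch of the skinny-to-wide transformation in plain Python. The sample rows, the "id" label, and the helper crosstab are hypothetical. This sketch assumes each (row, column) pair appears at most once; if it appeared more than once, the later value would overwrite the earlier one, whereas a real crosstab tool would normally ask for a function such as sum or count to combine them.

```python
# Hypothetical skinny data: one (id, attribute, value) triple per row.
skinny = [
    ("alice", "height", 170),
    ("alice", "weight", 60),
    ("bob", "height", 180),
    ("bob", "weight", 80),
]


def crosstab(rows):
    """Turn the distinct attribute names found in rows into columns."""
    columns = []  # new column names, in order of first appearance
    wide = {}     # row key -> {column name: value}
    for row_key, column_name, value in rows:
        if column_name not in columns:
            columns.append(column_name)
        wide.setdefault(row_key, {})[column_name] = value
    # Emit one wide row per key; cells with no data become None.
    table = [
        {"id": row_key, **{c: cells.get(c) for c in columns}}
        for row_key, cells in wide.items()
    ]
    return columns, table


columns, table = crosstab(skinny)
```

The four skinny rows become two wide rows, one per id, and the attribute names that were stored as cell contents become the column headers.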

Skinny data is seldom at the right level for analysis

A skinny table is a common shape for data to take