Please enable JavaScript.
Coggle requires JavaScript to display documents.
SQL (Aggregations (GROUP BY (GROUP BY can be used to aggregate data within…
SQL
Aggregations
count
Counts how many rows are in a particular column
Notice that COUNT does not consider rows that have NULL values
avg
Calculates the average of all values in a particular column
Ignores NULL completely
Can only be used on numerical columns
max
Returns the highest number, the latest date, or the non-numerical value closest alphabetically to Z
Ignores NULL values
min
Will return the lowest number, earliest date, or non-numerical value as early in the alphabet as possible.
Ignores NULL values
sum
Adds together all values in a column
Similar to count, but you can't use
, you have to sum over column name*
Can only be used on columns that have numerical values
Treats nulls as zero
INFO
All aggregations operates down columns, not across rows
NULL is ignored by aggregation (and means no data)
GROUP BY
GROUP BY can be used to aggregate data within subsets of the data. For example, grouping for different accounts, different regions, or different sales representatives.
Goes between there WHERE clause (if there is one) and the ORDER BY clause
You should always see any columns in the SELECT statement that are not being aggregated upon, in the group by statement
You can GROUP BY multiple columns at once
The order of column names in your GROUP BY clause doesn’t matter
GROUPing BY a date column is not usually very useful in SQL, as these columns tend to have transaction data down to a second.
DISTINCT
Always used in SELECT statements
Provides the unique rows for all columns written in the SELECT statement
It’s worth noting that using DISTINCT, particularly in aggregations, can slow your queries down quite a bit
HAVING
only useful when grouping by one or more columns
filter a query that has been aggregated
also commonly done using a
subquery
.
Logical operators
LIKE
This allows you to perform operations similar to using WHERE and =, but for cases when you might not know exactly what you are looking for
Requires the use of
wildcards
Used within a WHERE clause
IN
This allows you to perform operations similar to using WHERE and =, but for more than one condition
Allows you to filter data based on several possible values
Requires single quotes around non-numerical data
NOT
This is used with IN and LIKE to select all of the rows NOT LIKE or NOT IN a certain condition
AND & BETWEEN
These allow you to combine operations where all combined conditions must be true
Can be linked with other logical operators or aritmetic ones
Use between when you can
OR
This allow you to combine operations where at least one of the combined conditions must be true
This operator works with all of the operations we have seen so far including arithmetic operators (+,
, -, /), LIKE, IN, NOT, AND, and BETWEEN logic can all be linked together using the OR operator*
Conventions
Avoid spaces, use underscore
Capitalize commands, everything else lower case
use whitespace in queries for structure
use semicolon at the end of query;
Subqueries
known as inner queries, or nested queries
Whenever we need to use existing tables to create a new table that we then want to query again, this is an indication that we will need to use some sort of subquery
The database treats the inner query as an independent query and runs that before the outer query
Once the inner query has run, the outer query will run across the results table from the inner query
Statements
CREATE TABLE
DROP TABLE
SELECT
what columns to read from the FROM table
Derived column
use AS <new_name> to give the derived column a name
DISTINCT -
used to return only distinct (different) values
CASE
always goes in the SELECT clause
Must include WHEN, THEN, and END
ELSE and operators like OR, AND, LIKE, IN is optional
catch any of those division by zero values that were causing the error
WHEN = IF, THEN=THEN, END=CASE Statement must end with END
ELSE provides a way to capture values not specified in 'when' and 'then' statements
Aliases
Aliases are created with the FROM or JOIN clauses
Frequently used is just the first letter in a table name
Can also be used on columns, tho not as frequently as tables
Clauses
Not mandatory
LIMIT
ORDER BY
order our table by any row
comes after FROM but before LIMIT
DESC - flip the ordering
we can ORDER BY more than one column at a time. The statement sorts according to columns listed from left first and those listed on the right after that.
You can order by DESC for any column in your order by
WHERE
filter a set of results based on specific criteria
Symbols
Regular operators
Logical operators
comes after FROM but before ORDER BY or LIMIT
When working on non-numbers, use single quotes
Doesnt allow you to filter on aggregate columns -> Use HAVING instead in this situation
JOIN
Tells query an additional table from which you would like to pull data
*need
to specify the relationship between the two tables with the ON clause*
ON
Specifies a logical statement to combine the table in from and join statements
Should always occur with the FK being equal to the PK
Mandatory
FROM
what data to use
Functions on date
DATE_TRUNC
Sort by
second, day, month, year
SELECT TRUNC('day', date_column) AS day
DOW
day of week
SELECT TRUNC('dow', date_column) AS day
0 = sunday, etc
DATE_PART
Wildcards
%
represents
any
character
Arithmetic operators
(+, *, -, /)
Types
Table Expressions
Persistent Derived Tables