Google Data Analytics
1⃣ Foundations
Introduction
Ask
Prepare
Process
Analyze
Share
Act
Profiles
Performance
Statistics
Analytics
Explore
Accurate
Skills
Curiosity
Understanding context
Technical mindset
Data design
Data strategy
Thinking
Visualization
Strategy
Problem Orientation
Correlation
Big Picture & Details
Gap analysis
Current
Target
Lifecycle
Planning
Capture
Manage
Analyze
Archive
Destroy
Tools
Spreadsheets
SQL
Select
From
Where
Visualization
2⃣ Ask
3⃣ Prepare
Bias
Conformity
Sampling
Interpretation
Observer
Good
Reliable
Original
Comprehensive
Current
Cited
Type
Wide
Long
Multiple time points
Unique Time
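A minimal sketch of the wide vs. long distinction above, assuming a hypothetical table with one row per country and one column per year. The function name and field names are illustrative only; the idea is that long format stores one observation per row.

```python
# Hypothetical wide-format table: one row per subject, one column per time point.
wide_rows = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]


def wide_to_long(rows, id_field, variable_name="year", value_name="value"):
    """Turn every non-id column into its own row (one observation per row)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the id column is repeated on each output row instead
            long_rows.append(
                {id_field: row[id_field], variable_name: column, value_name: value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows, id_field="country")
for r in long_rows:
    print(r)
```

Two wide rows with two time points each become four long rows, each carrying the subject, the time point, and the value.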
Ethics
Ownership
Transparency
Consent
Currency
Privacy
Openness
Open Data
data.gov
census.gov
opendatanetwork.com
cloud.google.com/solutions/datasets
datasetsearch.research.google.com
Databases
Keys
Primary
Foreign
Primary
Other
Table
Metadata
Descriptive
Structural
Administrative
ISBN
Technical
Source
Data governance
Data sources
Internal
External
who.int
kaggle
BigQuery
Sandbox
Free Trial
SQL Workspace
SELECT COUNT(*) AS title
SELECT DISTINCT start_station_name
ORDER BY
label_name DESC
SELECT SUM(number_of_strikes)
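The query fragments above can be tried without a BigQuery account. This is a sketch only: it uses Python's built-in sqlite3 in place of BigQuery, and the table `events` and its rows are made up for illustration. Only the column names come from the notes.

```python
import sqlite3

# In-memory database standing in for a BigQuery dataset (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (start_station_name TEXT, number_of_strikes INTEGER)"
)
conn.executemany(
    "INSERT INTO events (start_station_name, number_of_strikes) VALUES (?, ?)",
    [("Central", 3), ("Central", 5), ("Harbor", 2)],  # hypothetical rows
)

# COUNT(*) with an alias: how many rows the table holds.
row_count = conn.execute("SELECT COUNT(*) AS total FROM events").fetchone()[0]

# DISTINCT removes repeated values; ORDER BY ... DESC sorts Z to A.
stations = [
    name
    for (name,) in conn.execute(
        "SELECT DISTINCT start_station_name FROM events "
        "ORDER BY start_station_name DESC"
    )
]

# SUM adds up a numeric column.
total_strikes = conn.execute(
    "SELECT SUM(number_of_strikes) FROM events"
).fetchone()[0]

print(row_count, stations, total_strikes)
conn.close()
```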
Naming Convention
Date
Version
4⃣ Process
No Data
Data Integrity
Definition
Accuracy
Replication ⚠
Transfer ⚠
Issues
Duplication
Not enough
Completeness
Insufficient
Only 1 source
Keep updating
Outdated
Geographically limited 🌎
Sample Size
Bias
Random Sampling
Statistical Power 🔥
Not Chance
Over 80%
Significance
Consistency
Confidence Level
99%
Margin of Error
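Confidence level and margin of error together determine a minimum sample size. The sketch below assumes the common formula for estimating a proportion, n = z² · p(1 − p) / e², with an optional finite-population correction. The function name and the default p = 0.5 (the most conservative choice) are assumptions, not something the notes specify.

```python
import math
from statistics import NormalDist


def sample_size(confidence_level, margin_of_error, population=None, proportion=0.5):
    """Minimum sample size for estimating a proportion (illustrative sketch)."""
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Finite-population correction: small populations need fewer samples.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


n_95 = sample_size(0.95, 0.05)
n_99 = sample_size(0.99, 0.05)
n_small_population = sample_size(0.95, 0.05, population=1000)
print(n_95, n_99, n_small_population)
```

A higher confidence level or a smaller margin of error raises the required sample size; a small known population lowers it.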
Cleaning
Data Validation
Duplicates
Irrelevant
Misspelling
Typos
Punctuation
Data Merging
Data Mapping
Compatibility
Concatenate
SQL
SELECT
FROM
WHERE
GROUP BY
ORDER BY
INSERT INTO tablename
(value1,value2,...)
UPDATE tablename
SET Field=new_value
WHERE
CREATE TABLE IF NOT EXISTS
VALUES
(newvalue1,newvalue2,...)
Attributes
DISTINCT
LENGTH
SUBSTR
TRIM
CAST(field_name AS FLOAT64)
CONCAT
COALESCE()
Non Null
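The cleaning statements and functions listed above can be combined in one pass. This sketch again uses Python's sqlite3 rather than BigQuery, so two details differ and are assumptions of the example: `CAST(... AS REAL)` stands in for `CAST(... AS FLOAT64)`, and the `||` operator stands in for `CONCAT`. The `customers` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS customers "
    "(name TEXT, state TEXT, price TEXT, nickname TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, state, price, nickname) VALUES (?, ?, ?, ?)",
    [
        ("  Ana ", "OHIO", "19.50", None),  # stray spaces, long state, no nickname
        ("Ben", "OH", "5", "Benny"),
    ],
)

# UPDATE + LENGTH + SUBSTR: shorten state values longer than two characters.
conn.execute(
    "UPDATE customers SET state = SUBSTR(state, 1, 2) WHERE LENGTH(state) > 2"
)

cleaned = conn.execute(
    """
    SELECT TRIM(name)                     AS name,          -- drop extra spaces
           state,
           CAST(price AS REAL)            AS price,         -- text to number
           COALESCE(nickname, TRIM(name)) AS display_name,  -- first non-null value
           TRIM(name) || '-' || state     AS customer_key   -- concatenation
    FROM customers
    ORDER BY customer_key
    """
).fetchall()

for row in cleaned:
    print(row)
conn.close()
```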
Verify ✔
Changelog
Accurate
Reliable
Business Problem ⁉
Goal 🥅
Data Origin
5⃣ Analyze
Organize
Format
Get inputs
Transform
Sorting
Filtering
SORT
Order By
Convert
Data Validation
Conditional Formatting
CONCATENATE
LEN
RIGHT
LEFT
VLOOKUP
JOIN
Calculation
Pivot Table
6⃣ Visualize 👁
Data Visualization
Success
Information
Data
Story
Concept
Goal
Function
Visual Form
Metaphor
Attributes
Position
Size
Shape
Color
Graphs
Bar
Line
Pie
Maps
Tableau
Dynamic
Tableau Public
Storytelling 📖
Dashboard
Steps
Engage Audience
Create Compelling
Interesting Narrative
Primary Message
Spotlighting
Only
Needed
Important
Layout
Cohesive
Sharing
Data
Static
Live
Filter
7⃣ R Programming
Narrative
Tips
Characters
Setting
Plot
Big Reveal
Aha Moment
5 lines / 25 Words / Slide
Together
Business Task
Business Metrics
Hypothesis
Data
Business Impacts
Packages
library(tidyverse)
Functions
print()
Objects 🎁
Variable <- Value
Vector
c(x,y,z,...)
Pipe
%>%
typeof()
length()
is.logical()
names()
List
list()
Diversity
1 type
str()
Lubridate
Date 📆
Time ⏰
today()
now()
ymd()
hms()
Dataframes
data.frame(x=c(x,y,...),y=c(...),z=c(...)...)
Files
file.create("new_text_file.txt")
file.copy("new_text_file.txt", "destination_folder")
Matrix
1 Type
matrix(c(3:8), nrow = 2)
Clean
ggplot2
tidyr
readr
dplyr
data("ToothGrowth")
View(ToothGrowth)
filtered_tg <- filter(ToothGrowth,dose==0.5)
arrange(filtered_tg,len)
pipe_name <- data_set_name %>%
filter(dose==0.5) %>%
group_by(supp) %>%
arrange(len)
Dataframes
mutate()
str()
colnames()
read_csv()
Packages
here
skimr
janitor
dplyr
Functions
skim_without_charts()
glimpse()
head()
select(column_name)
select(-column_name)
rename(column_name=new_name)
drop_na()
unite()
Visualize
ggplot2
Facets
library("ggplot2")
library("palmerpenguins")
ggplot(data=penguins)+geom_point(mapping=aes(x=flipper_length_mm,y=body_mass_g,color/shape/size=species))
ggplot(data=penguins)+geom_point(mapping=aes(x=flipper_length_mm,y=body_mass_g),color="purple")
geom
_point
_bar(...x...,color/size/alpha/fill)
_line
_smooth(...x,y...,linetype)
Facets
+facet_wrap(~species)
+facet_grid(sex~species)
Annotations
+labs(title="YYY",subtitle="ZZZ",caption="AAA")
+annotate("text",x=x_value,y=y_value,label="XXX")
Save
Export
Plot
ggsave()
RMarkdown
install.packages("rmarkdown")
8⃣ Project