Modern Time Series Forecasting with Python
Author: Manu Joseph
TS-Ch1-DGP and Synthetic Timeseries
Types of time series
Irregular time series
Observations may not occur at regular intervals of time
Regular time series
such as every hour or every month
Main areas of application for time series analysis
Time series forecasting
e.g. predict the next day's temperature using the last 5 years of temperature data.
Time series classification
Given a history of normal and abnormal data, predict whether a new result is normal or abnormal.
Interpretation and causality
Understand the whats and whys of the time series based on the past values, understand the interrelationships among several related time series, or derive causal inference based on time series data
Data-generating process (DGP)
Generating synthetic time series
White and red noise
White noise
A sequence of random numbers with zero mean and constant standard deviation
Red noise
Also has zero mean and constant variance, but is serially correlated in time
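The difference between the two kinds of noise can be sketched with NumPy alone. This is a minimal illustration, not TimeSynth's implementation; the correlation coefficient `r = 0.8`, the series length, and the helper `lag1_autocorr` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# White noise: independent draws with zero mean and constant standard deviation.
white = rng.normal(loc=0.0, scale=1.0, size=n)

# Red noise: each value keeps a fraction r of the previous value, which makes
# the series serially correlated. r = 0.8 is an illustrative assumption.
r = 0.8
red = np.zeros(n)
for t in range(1, n):
    # The sqrt(1 - r**2) factor keeps the long-run variance close to 1.
    red[t] = r * red[t - 1] + np.sqrt(1 - r**2) * white[t]


def lag1_autocorr(x):
    """Correlation between a series and itself shifted by one step."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


print(f"white noise lag-1 autocorrelation: {lag1_autocorr(white):.2f}")
print(f"red noise lag-1 autocorrelation:   {lag1_autocorr(red):.2f}")
```

The white series should show a lag-1 autocorrelation near zero, while the red series should sit close to `r`.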
Signals
Cyclical or seasonal signals
Autoregressive signals
Mix and match
Stationary and non-stationary time series
Code
DGP by using TimeSynth in 5 steps:
Define a signal and noise generator
Define a time sampler (time range)
Define how many data points to sample in the time range
Define a time series from the signal and noise generators
Sample the signal from the time series
Terminology
Forecasting
Multivariate forecasting
Explanatory forecasting
Backtesting
In-sample and out-of-sample
Exogenous and endogenous variables
Forecast combination
TS-Ch2-Acquiring and Processing Time Series Data
Load data
pandas datetime operations, indexing, and slicing
Converting the date columns into pd.Timestamp/DatetimeIndex
parse_dates=False when using read_csv
use year_first or day_first
when using pd.to_datetime, use yearfirst and dayfirst
Using the .dt accessor and datetime properties
e.g. Week of Year: {df.date.dt.isocalendar().week.iloc[0]},
Quarter: {df.date.dt.quarter.iloc[0]}
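A minimal sketch of the conversion and the `.dt` accessor, assuming a hypothetical DataFrame whose `date` column holds day-first strings. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: dates stored as day-first strings (DD-MM-YYYY).
df = pd.DataFrame({"date": ["13-04-1987", "01-02-1988"], "value": [1.0, 2.0]})

# dayfirst=True makes "01-02-1988" parse as 1 February rather than 2 January.
df["date"] = pd.to_datetime(df["date"], dayfirst=True)

# The .dt accessor exposes datetime properties on a datetime column.
week = df["date"].dt.isocalendar().week.iloc[0]
quarter = df["date"].dt.quarter.iloc[0]
print(f"Week of Year: {week}, Quarter: {quarter}")

# With a DatetimeIndex, partial date strings can select a whole year or month.
series = df.set_index("date")["value"]
print(series.loc["1988"])
```

`isocalendar().week` is used here because the older `weekofyear` property is no longer available in pandas 2.x.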
Handling missing data
Before starting imputation:
First check whether the data we are worried about is really missing, and figure out how the gap happened.
Simple methods
Forward fill
backward fill
Fill NA with a specific value
Linear interpolation
Nearest Interpolation
Spline, Polynomial, and Other Interpolations
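The simple methods above map onto a handful of pandas calls. The sketch below uses a small made-up daily series; only the fill methods that need no extra dependency are shown.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with two gaps.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0], index=idx)

filled = pd.DataFrame(
    {
        "original": s,
        "ffill": s.ffill(),                        # carry the last seen value forward
        "bfill": s.bfill(),                        # pull the next seen value backward
        "constant": s.fillna(0.0),                 # fill with a specific value
        "linear": s.interpolate(method="linear"),  # straight line between neighbours
    }
)
print(filled)
```

In pandas, `interpolate` also accepts methods such as `"nearest"`, `"spline"`, and `"polynomial"`; those delegate to SciPy, so they need it installed, and the last two also need an `order` argument.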
Methods used to fill in large missing gaps in data
Imputing with the previous day
Hourly average profile
The hourly average for each weekday
Seasonal interpolation
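For large gaps, the two profile-based ideas can be sketched with a pandas `groupby`. This is an illustration under assumptions, not the book's code: the hourly series is synthetic, with a perfectly repeating daily pattern, and the two-day gap is injected by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series covering two weeks, with a fixed daily pattern.
idx = pd.date_range("2024-01-01", periods=24 * 14, freq="h")
values = 10 + 5 * np.sin(2 * np.pi * idx.hour.to_numpy() / 24)
s = pd.Series(values, index=idx)

# Knock out a two-day block to simulate a large gap.
gap = (s.index >= "2024-01-06") & (s.index < "2024-01-08")
s_missing = s.copy()
s_missing[gap] = np.nan

# Hourly average profile: mean of the observed values for each hour of the day.
hourly_profile = s_missing.groupby(s_missing.index.hour).transform("mean")
imputed_hourly = s_missing.fillna(hourly_profile)

# Hourly average for each weekday: group by (day of week, hour) instead.
keys = [s_missing.index.dayofweek, s_missing.index.hour]
weekday_profile = s_missing.groupby(keys).transform("mean")
imputed_weekday = s_missing.fillna(weekday_profile)

print(imputed_hourly.loc["2024-01-06"].head())
```

The weekday variant only works if each (weekday, hour) pair is observed at least once outside the gap, which is why the example spans two weeks.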
Types of features
Time Series Identifiers
Metadata or Static Features
Time-Varying Features
Formatting of a dataset
Compact -> reduces memory usage and is convenient for visualization
Expanded
Wide
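The three layouts can be sketched on a toy dataset. The definitions used here are an assumption based on common usage: expanded has one row per series and timestamp, wide has one column per series, and compact has one row per series with the values stored as an array. All column names are hypothetical.

```python
import pandas as pd

# Expanded form: one row per (series, timestamp).
expanded = pd.DataFrame(
    {
        "series_id": ["A", "A", "A", "B", "B", "B"],
        "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
        "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

# Wide form: timestamps as the index, one column per series.
wide = expanded.pivot(index="timestamp", columns="series_id", values="value")

# Compact form: one row per series, with the values kept as a list.
grouped = expanded.sort_values("timestamp").groupby("series_id")
compact = pd.DataFrame(
    {
        "start_timestamp": grouped["timestamp"].min(),
        "series_length": grouped["value"].size(),
        "series_values": grouped["value"].apply(list),
    }
).reset_index()

print(wide)
print(compact)
```

The compact form drops the repeated timestamps, which is where the memory saving comes from when the series are regular.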
Best practices
Check the end dates of all the time series.
Check the row count before and after a merge;
use validate="one_to_one" or validate="many_to_one"
Use pandas resample function
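These checks can be sketched together on a toy dataset. The meter readings, metadata table, and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical half-hourly readings for two meters, plus one metadata row per meter.
stamps = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 00:30", "2024-01-01 01:00", "2024-01-01 01:30"]
)
readings = pd.DataFrame(
    {
        "meter_id": ["m1"] * 4 + ["m2"] * 4,
        "timestamp": list(stamps) * 2,
        "kwh": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
    }
)
metadata = pd.DataFrame({"meter_id": ["m1", "m2"], "region": ["north", "south"]})

# Check the end dates of all the time series.
end_dates = readings.groupby("meter_id")["timestamp"].max()
print(end_dates)

# validate="many_to_one" raises an error if metadata has duplicate meter_id keys.
rows_before = len(readings)
merged = readings.merge(metadata, on="meter_id", how="left", validate="many_to_one")
assert len(merged) == rows_before, "merge changed the row count"

# Resample each meter's readings from half-hourly to hourly totals.
hourly = merged.set_index("timestamp").groupby("meter_id")["kwh"].resample("h").sum()
print(hourly)
```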
TS-Ch3-Analyzing and Visualizing Time Series Data
Components of TS
Trend
Seasonal (regular)
Cyclical (irregular, e.g. economic recession)
Irregular (= residual or error term)
Two common ways to combine the components:
Additive (Y = Trend + Seasonal + Cyclical + Irregular)
Multiplicative (Y = Trend * Seasonal * Cyclical * Irregular)
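A small NumPy sketch of how the two ways of combining components behave. For brevity it uses only a trend and a seasonal component, and all numbers are illustrative assumptions. In the additive case the seasonal swing stays constant, in the multiplicative case it grows with the trend, and taking logs turns the multiplicative form into an additive one.

```python
import numpy as np

t = np.arange(48)  # four years of monthly steps
trend = 100 + 2.0 * t
seasonal_add = 10 * np.sin(2 * np.pi * t / 12)      # in the same units as the series
seasonal_mul = 1 + 0.1 * np.sin(2 * np.pi * t / 12)  # a factor around 1

y_add = trend + seasonal_add
y_mul = trend * seasonal_mul

# Size of the seasonal swing around the trend, first year versus last year.
swing_add = (np.abs(y_add - trend)[:12].max(), np.abs(y_add - trend)[36:].max())
swing_mul = (np.abs(y_mul - trend)[:12].max(), np.abs(y_mul - trend)[36:].max())
print("additive swing, first vs last year:      ", swing_add)
print("multiplicative swing, first vs last year:", swing_mul)

# The log of a multiplicative series is additive in the logged components.
log_is_additive = np.allclose(np.log(y_mul), np.log(trend) + np.log(seasonal_mul))
print("log(y_mul) is additive:", log_is_additive)
```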
Visualization
Line charts
Seasonal plots
Seasonal box plots
Calendar heatmaps
Autocorrelation plot
Decomposing a TS
Detrending
Two popular ways:
Moving averages
Locally estimated scatterplot smoothing (LOESS) regression
Deseasonalizing
Two popular ways:
Period adjusted averages
Fourier series
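A minimal sketch of one detrending and one deseasonalizing approach from the lists above: a moving average for the trend and period adjusted averages for the seasonality. The monthly series is synthetic, and the plain 12-point centered window is a simplification (classical decomposition uses a slightly different centered average for even periods).

```python
import numpy as np
import pandas as pd

period = 12
idx = pd.date_range("2015-01-01", periods=period * 6, freq="MS")
t = np.arange(len(idx))
rng = np.random.default_rng(0)

# Synthetic additive series: linear trend + yearly seasonality + noise.
seasonal_true = 10 * np.sin(2 * np.pi * t / period)
y = pd.Series(50 + 0.5 * t + seasonal_true + rng.normal(0, 1, len(t)), index=idx)

# Detrending with a moving average: a window of one full period averages
# the seasonality out, leaving an estimate of the trend.
trend = y.rolling(window=period, center=True).mean()
detrended = y - trend

# Deseasonalizing with period adjusted averages: the seasonal value for each
# month is the mean of the detrended values observed in that month.
seasonal = detrended.groupby(detrended.index.month).transform("mean")

residual = y - trend - seasonal
print(residual.dropna().describe())
```

The moving average leaves `period - 1` missing values at the edges of the series, so the residual is only defined in the interior.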
Implementations
seasonal_decompose from statsmodels
Seasonality and trend decomposition using LOESS (STL)
Fourier decomposition
Multiple seasonality decomposition using LOESS (MSTL)
Detecting and treating outliers
Interquartile range (IQR)
Isolation Forest
Extreme studentized deviate (ESD) and seasonal ESD (S-ESD)
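Of the methods listed, the IQR rule is the simplest to sketch. The example below uses synthetic data with two injected outliers; the 1.5 multiplier is the conventional choice and is an assumption here, and replacing flagged points by linear interpolation is just one possible treatment.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
s = pd.Series(rng.normal(100, 5, 200))
s.iloc[[20, 120]] = [180.0, 10.0]  # two injected outliers

# Detect: flag points more than 1.5 * IQR outside the quartiles.
q1, q3 = s.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (s < lower) | (s > upper)
print(f"flagged {int(is_outlier.sum())} points outside [{lower:.1f}, {upper:.1f}]")

# Treat: blank out the flagged points and fill them by linear interpolation.
treated = s.mask(is_outlier).interpolate(method="linear")
```

On a series with strong trend or seasonality, the same rule is more meaningful when applied to the residual after decomposition rather than to the raw values.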