Linear Regression
2.1 What is Linear Regression?
Linear regression is a linear model, i.e. a model that assumes a linear relationship between the input variables (x) and the single output variable (y). More specifically, y can be calculated from a linear combination of the input variables (x).
Linear regression finds a straight line that fits the data points as closely as possible, so that the total error between the points and the line is minimized.
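As a minimal sketch of this idea, the snippet below fits a line to a small hypothetical dataset (the x and y values are made up for illustration) using NumPy's `polyfit`:

```python
import numpy as np

# Hypothetical toy data: y is roughly 2*x + 1 plus a little noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Fit a degree-1 polynomial (a straight line y = slope*x + intercept)
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)  # close to 2 and 1
```

`polyfit` returns the coefficients from the highest degree down, which is why the slope comes first.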
2.2 Loss Function
2.2.1 What is a Loss Function?
The loss function is a measure of how well a prediction model performs at predicting the expected outcome.
The loss function measures the error of a regression model.
2.2.2 Loss Function for Linear Regression
error = |observed Y value − predicted Y′ value|, written mathematically as: error_i = |y_i − y_i′|
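This per-point error is straightforward to compute; the snippet below uses hypothetical observed and predicted values just to show the formula in code:

```python
import numpy as np

# Hypothetical observed values y_i and model predictions y_i'
y_obs  = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 6.0])

# error_i = |y_i - y_i'|
errors = np.abs(y_obs - y_pred)
print(errors)  # [0.5 0.5 1. ]
```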
2.3 Least Squares
Definition
The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems by minimizing the sum of the squares of the residuals made in the results of every single equation.
To obtain the prediction line closest to the truth, we can minimize the SSE (Sum of Squared Errors).
Interpretation
Least Squares builds the objective function from the cost-function point of view, using the definition of distance.
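For simple linear regression, minimizing the SSE has a closed-form solution via the normal equations. The sketch below applies it to the same kind of hypothetical toy data as above; the specific numbers are illustrative only:

```python
import numpy as np

# Hypothetical toy data, y roughly 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix with an intercept column of ones;
# the normal equations give beta = (X^T X)^{-1} X^T y
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)

sse = np.sum((y - X @ beta) ** 2)
print(beta)  # [intercept, slope], close to [1, 2]
print(sse)   # small residual sum of squares
```

Using `np.linalg.solve` rather than explicitly inverting X^T X is the numerically preferred way to solve the normal equations.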
2.4 Maximum Likelihood Estimation
Definition
Maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model given observations, by finding the parameter values that maximize the likelihood of making the observations given the parameters. In other words, the most reasonable parameter estimates (the values of β0 and β1) are those that maximize the probability of drawing these n sample observations from the distribution.
Interpretation
Probability-based parameter estimation requires an assumption about the population distribution. In (ordinary) linear regression, the default assumption is that p(y|x) is a normal distribution with mean μ = f(x) and variance σ² (a value independent of x). Here p(y|x) denotes the probability density of y given a particular value of x.
Derivation (see slides)
Then take the logarithm, and the result turns out to be the same as the Least Squares conclusion.
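A worked version of this step, using the Gaussian assumption stated above with f(x) = β0 + β1·x:

```latex
L(\beta_0,\beta_1)
  = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\!\left(-\frac{(y_i - f(x_i))^2}{2\sigma^2}\right)

\log L(\beta_0,\beta_1)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i)\bigr)^2

\arg\max_{\beta_0,\beta_1}\,\log L
  = \arg\min_{\beta_0,\beta_1}\,\sum_{i=1}^{n}\bigl(y_i - f(x_i)\bigr)^2
  = \arg\min\,\mathrm{SSE}
```

The first term of the log-likelihood does not depend on β0 or β1, so maximizing the likelihood is exactly minimizing the SSE.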
The relationship between Least Squares (LS) and Maximum Likelihood Estimation (MLE)
Summary:
When the observations satisfy certain conditions, the least-squares estimate and the maximum-likelihood estimate are identical.
2.5 Gradient Descent
Stochastic Gradient Descent (SGD)
Mini-batch Gradient Descent
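As a minimal sketch of the batch variant, the snippet below minimizes the (mean) squared error for simple linear regression by repeatedly stepping against the gradient. The data, learning rate, and iteration count are all illustrative assumptions:

```python
import numpy as np

# Hypothetical toy data, y roughly 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

b0, b1 = 0.0, 0.0   # intercept and slope, initialized at zero
lr = 0.02           # learning rate (assumed value)
n = len(x)

# Batch gradient descent on the mean squared error
for _ in range(5000):
    y_hat = b0 + b1 * x
    grad_b0 = (2.0 / n) * np.sum(y_hat - y)        # d(MSE)/d(b0)
    grad_b1 = (2.0 / n) * np.sum((y_hat - y) * x)  # d(MSE)/d(b1)
    b0 -= lr * grad_b0
    b1 -= lr * grad_b1

print(b0, b1)  # converges toward the least-squares solution (~1, ~2)
```

SGD would use one randomly chosen point per gradient step, and mini-batch gradient descent a small random subset; the update rule itself is unchanged.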
Least Squares builds the objective function from the cost-function point of view, using the definition of distance; classical parameter-estimation methods instead build the objective function from the probability point of view, e.g. Maximum Likelihood Estimation (MLE).