Support Vector Machines

Support Vector Machine Basics

Margin and Support Vectors

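Definitions

Margin

$\gamma = \frac{2}{\|w\|}$, the sum of the distances from the two classes' support vectors to the hyperplane

Separating hyperplane

$w^\top x + b = 0$, where $w$ is the normal vector and $b$ is the offset

Support vectors

The few training samples closest to the separating hyperplane

Basic form of the support vector machine (SVM)

$\min_{w,b} \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i(w^\top x_i + b) \ge 1, \quad i = 1, 2, \dots, m$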
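As a small illustration of the basic form, the sketch below computes the margin $2/\|w\|$ and checks the constraints $y_i(w^\top x_i + b) \ge 1$ for a candidate hyperplane. The function names and the toy data are assumptions made for this example only.

```python
import numpy as np

def margin(w):
    """Margin 2 / ||w|| of the hyperplane w^T x + b = 0."""
    return 2.0 / np.linalg.norm(w)

def satisfies_constraints(w, b, X, y, tol=1e-9):
    """True if y_i (w^T x_i + b) >= 1 holds for every sample."""
    return bool(np.all(y * (X @ w + b) >= 1.0 - tol))

# Hypothetical toy data: two samples per class, labels in {+1, -1}.
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([0.5, 0.5])
b = 0.0

print(margin(w), satisfies_constraints(w, b, X, y))
```

Shrinking `w` widens the margin but eventually breaks the constraints, which is the trade-off the basic form resolves.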

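The Dual Problem

Method of Lagrange multipliers

$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 + \sum_{i=1}^{m} \alpha_i \left(1 - y_i(w^\top x_i + b)\right)$, with multipliers $\alpha_i \ge 0$

Resulting dual problem

$\max_{\alpha} \sum_{i=1}^{m} \alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j x_i^\top x_j \quad \text{s.t.} \quad \sum_{i=1}^{m} \alpha_i y_i = 0, \quad \alpha_i \ge 0$

KKT conditions

$\alpha_i \ge 0$, $\quad y_i f(x_i) - 1 \ge 0$, $\quad \alpha_i \left(y_i f(x_i) - 1\right) = 0$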

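SMO algorithm

  • Select a pair of variables $\alpha_i$ and $\alpha_j$ to update
  • Fix all parameters other than $\alpha_i$ and $\alpha_j$, then solve the dual problem to obtain the updated $\alpha_i$ and $\alpha_j$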
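The two steps above are repeated until no pair changes any more. The sketch below is a simplified variant of SMO, not the full algorithm: it picks the second variable at random instead of using a selection heuristic, uses a linear kernel, and caps each multiplier at a large assumed bound `C` so that it approximates the hard-margin dual. All names and default values are illustrative.

```python
import numpy as np

def simplified_smo(X, y, C=1e3, tol=1e-4, max_passes=20, max_sweeps=10000, seed=0):
    """Simplified SMO for the SVM dual with a linear kernel.

    y must contain +1 / -1 labels. Returns (alpha, w, b).
    """
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    K = X @ X.T                      # Gram matrix of inner products
    alpha, b = np.zeros(m), 0.0
    passes = sweeps = 0
    while passes < max_passes and sweeps < max_sweeps:
        sweeps += 1
        changed = 0
        for i in range(m):
            E_i = (alpha * y) @ K[:, i] + b - y[i]
            # Step 1: choose alpha_i only if it violates the KKT conditions.
            if not ((y[i] * E_i < -tol and alpha[i] < C) or
                    (y[i] * E_i > tol and alpha[i] > 0)):
                continue
            j = int(rng.integers(m - 1))
            if j >= i:
                j += 1               # random partner j != i
            E_j = (alpha * y) @ K[:, j] + b - y[j]
            a_i_old, a_j_old = alpha[i], alpha[j]
            # Bounds that keep sum_i alpha_i y_i = 0 and 0 <= alpha <= C.
            if y[i] != y[j]:
                L, H = max(0.0, a_j_old - a_i_old), min(C, C + a_j_old - a_i_old)
            else:
                L, H = max(0.0, a_i_old + a_j_old - C), min(C, a_i_old + a_j_old)
            eta = 2.0 * K[i, j] - K[i, i] - K[j, j]
            if L == H or eta >= 0:
                continue
            # Step 2: closed-form update of the pair, all others fixed.
            a_j = float(np.clip(a_j_old - y[j] * (E_i - E_j) / eta, L, H))
            if abs(a_j - a_j_old) < 1e-7:
                continue
            a_i = a_i_old + y[i] * y[j] * (a_j_old - a_j)
            alpha[i], alpha[j] = a_i, a_j
            b1 = b - E_i - y[i] * (a_i - a_i_old) * K[i, i] - y[j] * (a_j - a_j_old) * K[i, j]
            b2 = b - E_j - y[i] * (a_i - a_i_old) * K[i, j] - y[j] * (a_j - a_j_old) * K[j, j]
            if 0 < a_i < C:
                b = b1
            elif 0 < a_j < C:
                b = b2
            else:
                b = 0.5 * (b1 + b2)
            changed += 1
        passes = passes + 1 if changed == 0 else 0
    w = (alpha * y) @ X
    return alpha, w, b
```

Because each update changes $\alpha_i$ and $\alpha_j$ together, the equality constraint $\sum_i \alpha_i y_i = 0$ stays satisfied throughout.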

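Determining the offset $b$

1. Every support vector $(x_s, y_s)$ satisfies $y_s f(x_s) = 1$

2. $y_s \left( \sum_{i \in S} \alpha_i y_i x_i^\top x_s + b \right) = 1$, where $S = \{ i \mid \alpha_i > 0 \}$ is the index set of all support vectors

3. Average the solution over all support vectors: $b = \frac{1}{|S|} \sum_{s \in S} \left( \frac{1}{y_s} - \sum_{i \in S} \alpha_i y_i x_i^\top x_s \right)$

Once $\alpha$ is solved, computing $w$ and $b$ yields the model

$w = \sum_{i=1}^{m} \alpha_i y_i x_i$

$f(x) = w^\top x + b = \sum_{i=1}^{m} \alpha_i y_i x_i^\top x + b$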
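A minimal sketch of recovering the model from a dual solution: $w = \sum_i \alpha_i y_i x_i$, and $b$ averaged over the support vectors (the samples with $\alpha_i > 0$). The function names, the threshold `eps`, and the example values of `alpha` are assumptions for illustration.

```python
import numpy as np

def recover_w_b(alpha, X, y, eps=1e-8):
    """Recover (w, b, S) from dual variables alpha; S indexes the support vectors."""
    S = np.flatnonzero(alpha > eps)
    w = (alpha * y) @ X                     # w = sum_i alpha_i y_i x_i
    # b = mean over s in S of (1 / y_s - w^T x_s); terms with alpha_i = 0 vanish in w
    b = float(np.mean(1.0 / y[S] - X[S] @ w))
    return w, b, S

def decision_function(x, w, b):
    """f(x) = w^T x + b."""
    return x @ w + b

# Hypothetical example: the third sample is not a support vector (alpha = 0).
X = np.array([[1.0, 1.0], [-1.0, -1.0], [3.0, 3.0]])
y = np.array([1.0, -1.0, 1.0])
alpha = np.array([0.25, 0.25, 0.0])
w, b, S = recover_w_b(alpha, X, y)
print(w, b, S)
```

Averaging over all support vectors rather than using a single one makes the estimate of $b$ less sensitive to numerical error in any individual $\alpha_i$.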

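Kernel Functions

Motivation

Problem

1. The original sample space may contain no hyperplane that correctly separates the two classes

Solution

2. Map the samples from the original space into a higher-dimensional feature space

3. so that the samples become linearly separable in the new space

Definition

$\phi(x)$ denotes the feature vector obtained by mapping $x$

Kernel function: $\kappa(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle = \phi(x_i)^\top \phi(x_j)$

The inner product of $x_i$ and $x_j$ in the feature space equals the value computed by $\kappa(\cdot, \cdot)$ in the original sample space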
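To make the identity between the feature-space inner product and the kernel value concrete, the sketch below uses one kernel for which the mapping can be written out explicitly: for 2-D inputs, $\kappa(x, z) = (x^\top z)^2$ corresponds to $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. This particular kernel and the function names are chosen for illustration only.

```python
import numpy as np

def phi(x):
    """Explicit feature map for 2-D input matching kappa(x, z) = (x^T z)^2."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def kappa(x, z):
    """Quadratic kernel evaluated in the original 2-D space."""
    return float(x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
# Both routes give the same number; kappa never builds the 3-D vectors.
print(phi(x) @ phi(z), kappa(x, z))
```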

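Theorem

If $\kappa(\cdot, \cdot)$ is a symmetric function and the kernel matrix $K$ is positive semidefinite for any data set

then $\kappa(\cdot, \cdot)$ can be used as a kernel function

and it implicitly defines a feature space called a "reproducing kernel Hilbert space" (RKHS)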

Obtaining kernels

Common kernel functions

Gaussian kernel, Laplacian kernel, Sigmoid kernel

Linear kernel, polynomial kernel
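The kernels named above can be sketched as follows, together with a numerical check that a kernel matrix is symmetric and positive semidefinite. The parameter names (`d`, `sigma`, `beta`, `theta`) and their defaults are assumptions; the Sigmoid kernel is included for completeness, but its kernel matrix is not positive semidefinite for every parameter choice.

```python
import numpy as np

def _sq_dists(X, Z):
    """Pairwise squared Euclidean distances between rows of X and rows of Z."""
    diff = X[:, None, :] - Z[None, :, :]
    return (diff ** 2).sum(axis=-1)

def linear_kernel(X, Z):
    return X @ Z.T

def polynomial_kernel(X, Z, d=2):
    return (X @ Z.T) ** d

def gaussian_kernel(X, Z, sigma=1.0):
    return np.exp(-_sq_dists(X, Z) / (2.0 * sigma ** 2))

def laplacian_kernel(X, Z, sigma=1.0):
    return np.exp(-np.sqrt(_sq_dists(X, Z)) / sigma)

def sigmoid_kernel(X, Z, beta=1.0, theta=-1.0):
    return np.tanh(beta * (X @ Z.T) + theta)

def is_symmetric_psd(K, tol=1e-8):
    """Check symmetry and that the smallest eigenvalue is >= -tol."""
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh((K + K.T) / 2.0).min() >= -tol)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
for name, k in [("linear", linear_kernel), ("polynomial", polynomial_kernel),
                ("gaussian", gaussian_kernel), ("laplacian", laplacian_kernel)]:
    print(name, is_symmetric_psd(k(X, X)))
```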

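Obtained by combination

Linear combination

$\gamma_1 \kappa_1 + \gamma_2 \kappa_2$ is a kernel function for any positive $\gamma_1$, $\gamma_2$

Direct product

$\kappa_1 \otimes \kappa_2 (x, z) = \kappa_1(x, z)\,\kappa_2(x, z)$

For any function $g(x)$

$\kappa(x, z) = g(x)\,\kappa_1(x, z)\,g(z)$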
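The three combination rules can be checked numerically on a small sample: each combined kernel should again give a positive semidefinite kernel matrix. The base kernels `k1` and `k2`, the weights, and the function `g` below are arbitrary choices made for this sketch.

```python
import numpy as np

def k1(x, z):
    """Linear kernel."""
    return float(x @ z)

def k2(x, z):
    """Gaussian kernel with unit bandwidth."""
    return float(np.exp(-np.sum((x - z) ** 2) / 2.0))

def g(x):
    """Arbitrary (hypothetical) real-valued function of x."""
    return 1.0 + x[0] ** 2

def linear_combination(x, z):
    return 0.5 * k1(x, z) + 2.0 * k2(x, z)

def direct_product(x, z):
    return k1(x, z) * k2(x, z)

def g_scaled(x, z):
    return g(x) * k1(x, z) * g(z)

def gram(kernel, X):
    """Kernel matrix K[i, j] = kernel(x_i, x_j)."""
    return np.array([[kernel(x, z) for z in X] for x in X])

def min_eigenvalue(K):
    return float(np.linalg.eigvalsh(K).min())

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
for kernel in (linear_combination, direct_product, g_scaled):
    print(kernel.__name__, min_eigenvalue(gram(kernel, X)))
```

A numerical check on one sample is only a sanity test; it does not replace the proof that these constructions always yield kernels.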

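Soft Margin

Allow some samples to violate the constraint $y_i(w^\top x_i + b) \ge 1$; while maximizing the margin, keep the number of violating samples as small as possible

Optimization objective

$\min_{w,b} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \ell_{0/1}\left(y_i(w^\top x_i + b) - 1\right)$

$\ell$ is the loss function

$\ell_{0/1}$ is the 0/1 loss function: $\ell_{0/1}(z) = 1$ if $z < 0$, and $0$ otherwise

Solving under the hinge loss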

Surrogate loss

Mathematical properties

Convex

An upper bound of $\ell_{0/1}$

Common surrogate losses

Hinge loss

$\ell_{hinge}(z) = \max(0, 1 - z)$

Exponential loss

$\ell_{exp}(z) = \exp(-z)$

Logistic loss

$\ell_{log}(z) = \log(1 + \exp(-z))$
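A short sketch of these losses as functions of $z = y f(x)$, useful for comparing them against the 0/1 loss. The function names are assumptions. One detail worth noting: with the natural logarithm, the logistic loss is an upper bound of the 0/1 loss only after dividing by $\ln 2$.

```python
import numpy as np

def zero_one_loss(z):
    """1 where z < 0, else 0."""
    return np.where(z < 0, 1.0, 0.0)

def hinge_loss(z):
    return np.maximum(0.0, 1.0 - z)

def exponential_loss(z):
    return np.exp(-z)

def logistic_loss(z):
    """log(1 + exp(-z)) with the natural logarithm."""
    return np.log1p(np.exp(-z))

z = np.linspace(-3.0, 3.0, 13)
print(np.all(hinge_loss(z) >= zero_one_loss(z)))
print(np.all(exponential_loss(z) >= zero_one_loss(z)))
print(np.all(logistic_loss(z) / np.log(2.0) >= zero_one_loss(z)))
```

The hinge loss is exactly zero for $z \ge 1$, which is what later gives the soft-margin SVM its sparse solution; the exponential and logistic losses are never exactly zero.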

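Soft-margin support vector machine

Introduce slack variables $\xi_i \ge 0$

Each $\xi_i$ measures the degree to which the sample violates the constraint $y_i(w^\top x_i + b) \ge 1$

$\min_{w,b,\xi} \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i \quad \text{s.t.} \quad y_i(w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \dots, m$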
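For a fixed $w$ and $b$, the smallest slack that satisfies both $y_i(w^\top x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ is $\xi_i = \max(0, 1 - y_i(w^\top x_i + b))$, which is the hinge loss of that sample. The sketch below computes these slacks and the resulting soft-margin objective; the function names, the data, and `C=1.0` are illustrative assumptions.

```python
import numpy as np

def slack_variables(w, b, X, y):
    """Smallest feasible slack per sample: max(0, 1 - y_i (w^T x_i + b))."""
    return np.maximum(0.0, 1.0 - y * (X @ w + b))

def soft_margin_objective(w, b, X, y, C=1.0):
    """0.5 * ||w||^2 + C * sum_i xi_i with the slacks above."""
    return 0.5 * float(w @ w) + C * float(slack_variables(w, b, X, y).sum())

# Hypothetical data: sample 2 lies inside the margin, sample 4 is misclassified.
X = np.array([[1.0, 1.0], [0.5, 0.5], [-1.0, -1.0], [0.2, 0.2]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = np.array([0.5, 0.5]), 0.0

print(slack_variables(w, b, X, y))
print(soft_margin_objective(w, b, X, y))
```

A slack between 0 and 1 means the sample is inside the margin but on the correct side; a slack above 1 means it is misclassified.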

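Solution

Method of Lagrange multipliers

KKT conditions

The final soft-margin SVM model depends only on the support vectors

Using the hinge loss therefore still preserves sparsity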

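Regularization

$\min_{f} \Omega(f) + C \sum_{i=1}^{m} \ell(f(x_i), y_i)$

Components

Structural risk

$\Omega(f)$

Describes certain properties of the model $f$

Also called the regularization term

Empirical risk

$\sum_{i=1}^{m} \ell(f(x_i), y_i)$

Describes how well the model fits the data

$C$ trades off the two

It is called the regularization constant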

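Support Vector Regression

Goal

Given training samples $D = \{(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)\}$, $y_i \in \mathbb{R}$

Learn a regression model of the form $f(x) = w^\top x + b$ such that $f(x)$ is as close to $y$ as possible

$w$ and $b$ are the parameters to be determined

A deviation of at most $\epsilon$ between $f(x)$ and $y$ is tolerated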

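Formalizing the SVR problem

$\min_{w,b} \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \ell_{\epsilon}\left(f(x_i) - y_i\right)$

$C$ is the regularization constant

$\ell_{\epsilon}$ is the $\epsilon$-insensitive loss function

$\ell_{\epsilon}(z) = 0$ if $|z| \le \epsilon$, and $|z| - \epsilon$ otherwise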
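A minimal sketch of the $\epsilon$-insensitive loss and the SVR objective built from it. Residuals inside the band of half-width $\epsilon$ cost nothing; outside it the loss grows linearly. The function names and the default values of `eps` and `C` are assumptions.

```python
import numpy as np

def eps_insensitive_loss(z, eps=0.1):
    """0 where |z| <= eps, |z| - eps elsewhere."""
    return np.maximum(0.0, np.abs(z) - eps)

def svr_objective(w, b, X, y, C=1.0, eps=0.1):
    """0.5 * ||w||^2 + C * sum_i loss(f(x_i) - y_i) with f(x) = w^T x + b."""
    residuals = X @ w + b - y
    return 0.5 * float(w @ w) + C * float(eps_insensitive_loss(residuals, eps).sum())

z = np.array([-0.3, -0.05, 0.0, 0.1, 0.5])
print(eps_insensitive_loss(z))
```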

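Solution

1. Introduce slack variables $\xi_i$ and $\hat{\xi}_i$

2. Apply the method of Lagrange multipliers together with the KKT conditions

Result

$w = \sum_{i=1}^{m} (\hat{\alpha}_i - \alpha_i)\, x_i$

$f(x) = \sum_{i=1}^{m} (\hat{\alpha}_i - \alpha_i)\, x_i^\top x + b$

With a feature mapping

$w = \sum_{i=1}^{m} (\hat{\alpha}_i - \alpha_i)\, \phi(x_i)$

$f(x) = \sum_{i=1}^{m} (\hat{\alpha}_i - \alpha_i)\, \kappa(x, x_i) + b$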
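Once the dual variables are known, prediction only needs kernel evaluations against the training samples. The sketch below evaluates $f(x) = \sum_i (\hat{\alpha}_i - \alpha_i)\,\kappa(x, x_i) + b$; it does not solve the SVR dual, and the multiplier values in the example are made-up numbers used only to exercise the formula.

```python
import numpy as np

def linear_kernel(x, z):
    return float(x @ z)

def svr_predict(x, X_train, alpha_hat, alpha, b, kernel=linear_kernel):
    """f(x) = sum_i (alpha_hat_i - alpha_i) * kernel(x, x_i) + b."""
    coef = alpha_hat - alpha
    return sum(c * kernel(x, x_i) for c, x_i in zip(coef, X_train)) + b

# Hypothetical dual variables, not the output of an actual solver.
X_train = np.array([[1.0, 0.0], [0.0, 2.0]])
alpha_hat = np.array([0.5, 0.0])
alpha = np.array([0.0, 0.25])
b = 0.1

print(svr_predict(np.array([2.0, 1.0]), X_train, alpha_hat, alpha, b))
```

With the linear kernel this agrees with $w^\top x + b$ for $w = \sum_i (\hat{\alpha}_i - \alpha_i)\, x_i$; swapping in another kernel changes only the `kernel` argument.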

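Kernel Methods

Representer Theorem

Statement

Optimization problem

$\min_{h \in \mathbb{H}} F(h) = \Omega(\|h\|_{\mathbb{H}}) + \ell\left(h(x_1), h(x_2), \dots, h(x_m)\right)$

Optimal solution

$h^*(x) = \sum_{i=1}^{m} \alpha_i \kappa(x, x_i)$

Conditions

Any monotonically increasing function $\Omega: [0, \infty] \to \mathbb{R}$

Any non-negative loss function $\ell: \mathbb{R}^m \to [0, \infty]$

$\mathbb{H}$ is the reproducing kernel Hilbert space associated with $\kappa$, and $\|h\|_{\mathbb{H}}$ is the norm of $h$ in $\mathbb{H}$

Significance

For general loss functions and regularization terms, the optimal solution of the optimization problem can always be expressed as a linear combination of the kernel functions $\kappa(x, x_i)$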
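One concrete instance of the theorem, chosen here for illustration and not taken from the notes, is kernel ridge regression: with squared loss and $\Omega(\|h\|) = \lambda \|h\|^2$, the optimal solution has the form $\sum_i \alpha_i \kappa(x, x_i)$ with $\alpha = (K + \lambda I)^{-1} y$. The sketch below checks that, for a linear kernel, this kernel-form solution gives the same predictions as ordinary ridge regression solved directly for $w$. The function names and `lam` are assumptions.

```python
import numpy as np

def linear_kernel(A, B):
    """Kernel matrix between rows of A and rows of B."""
    return A @ B.T

def fit_kernel_ridge(X, y, kernel, lam=1.0):
    """Coefficients alpha = (K + lam * I)^{-1} y."""
    K = kernel(X, X)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def predict(alpha, X_train, X_new, kernel):
    """h(x) = sum_i alpha_i * kernel(x, x_i) for each row x of X_new."""
    return kernel(X_new, X_train) @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
y = rng.normal(size=6)
X_new = rng.normal(size=(3, 2))

alpha = fit_kernel_ridge(X, y, linear_kernel, lam=0.5)
w = np.linalg.solve(X.T @ X + 0.5 * np.eye(2), X.T @ y)   # primal ridge solution
print(np.allclose(predict(alpha, X, X_new, linear_kernel), X_new @ w))
```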

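Kernelized Linear Discriminant Analysis (KLDA)

Mapping

$\phi: \mathcal{X} \to \mathbb{F}$ maps the samples into a feature space $\mathbb{F}$

Target function

$h(x) = w^\top \phi(x)$

Learning objective

$\max_{w} J(w) = \frac{w^\top S_b^{\phi} w}{w^\top S_w^{\phi} w}$

Quantities

Mean of the class-$i$ samples in the feature space $\mathbb{F}$

$\mu_i^{\phi} = \frac{1}{m_i} \sum_{x \in X_i} \phi(x)$

Scatter matrices

Between-class scatter matrix

$S_b^{\phi} = (\mu_1^{\phi} - \mu_0^{\phi})(\mu_1^{\phi} - \mu_0^{\phi})^\top$

Within-class scatter matrix

$S_w^{\phi} = \sum_{i=0}^{1} \sum_{x \in X_i} (\phi(x) - \mu_i^{\phi})(\phi(x) - \mu_i^{\phi})^\top$

Kernel function of $x$: $\kappa(x, x_i) = \phi(x_i)^\top \phi(x)$

$h(x) = \sum_{i=1}^{m} \alpha_i \kappa(x, x_i)$

$w = \sum_{i=1}^{m} \alpha_i \phi(x_i)$

$\hat{\mu}_0 = \frac{1}{m_0} K 1_0$, $\hat{\mu}_1 = \frac{1}{m_1} K 1_1$, where $K$ is the kernel matrix and $1_i \in \{1, 0\}^{m \times 1}$ is the indicator vector of class $i$

$M = (\hat{\mu}_0 - \hat{\mu}_1)(\hat{\mu}_0 - \hat{\mu}_1)^\top$

$N = K K^\top - \sum_{i=0}^{1} m_i \hat{\mu}_i \hat{\mu}_i^\top$

$\max_{\alpha} J(\alpha) = \frac{\alpha^\top M \alpha}{\alpha^\top N \alpha}$

Kernelization (that is, introducing a kernel function) extends a linear learner into a nonlinear one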
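A minimal KLDA sketch for two classes labeled 0 and 1, following the kernelized quantities $\hat{\mu}_i = \frac{1}{m_i} K 1_i$ and $N = K K^\top - \sum_i m_i \hat{\mu}_i \hat{\mu}_i^\top$. Because $M$ has rank one, the ratio $\alpha^\top M \alpha / \alpha^\top N \alpha$ is maximized by $\alpha \propto N^{-1}(\hat{\mu}_1 - \hat{\mu}_0)$. The Gaussian kernel, the small ridge term `reg` added to $N$ for numerical stability, and all function names are assumptions of this sketch.

```python
import numpy as np

def rbf_gram(X, Z, sigma=1.0):
    """Gaussian kernel matrix between rows of X and rows of Z."""
    diff = X[:, None, :] - Z[None, :, :]
    return np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))

def klda_fit(X, y, sigma=1.0, reg=1e-3):
    """Return the coefficient vector alpha for labels y in {0, 1}."""
    K = rbf_gram(X, X, sigma)
    m = len(y)
    N = K @ K.T
    mu = []
    for c in (0, 1):
        ind = (y == c).astype(float)        # indicator vector 1_c
        m_c = ind.sum()
        mu_c = K @ ind / m_c                # kernelized class mean
        mu.append(mu_c)
        N = N - m_c * np.outer(mu_c, mu_c)
    # Rank-one M: the maximizer is proportional to N^{-1} (mu_1 - mu_0).
    return np.linalg.solve(N + reg * np.eye(m), mu[1] - mu[0])

def klda_project(alpha, X_train, X_new, sigma=1.0):
    """h(x) = sum_i alpha_i * kappa(x, x_i) for each row x of X_new."""
    return rbf_gram(X_new, X_train, sigma) @ alpha

# Hypothetical data: class 0 near the origin, class 1 on a surrounding ring,
# which no straight line in the original space separates.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
y = np.array([0, 0, 0, 1, 1, 1, 1])
alpha = klda_fit(X, y)
print(klda_project(alpha, X, X))
```

With this sign convention, class 1 projects to larger values on average than class 0; a threshold between the two projected class means then acts as a nonlinear classifier in the original space.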