Training Error Analysis

The training error of AdaBoost is:

$$
ERR = \frac{\text{number of misclassified samples}}{\text{total number of samples}} = \frac{\sum_{i=1}^N I(G(x_i) \neq y_i)}{N}
$$
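As a small sketch of this definition (my own illustration, with a made-up four-point dataset and a hypothetical threshold classifier `G`):

```python
import numpy as np

# 0-1 training error: ERR = (number of misclassified samples) / N
def training_error(G, X, y):
    """Fraction of samples where the prediction G(x_i) differs from y_i."""
    preds = np.array([G(x) for x in X])
    return np.mean(preds != y)

# Hypothetical example: a stump that thresholds the first feature at 0.
X = np.array([[-2.0], [-1.0], [0.5], [1.5]])
y = np.array([-1, -1, 1, 1])
G = lambda x: 1 if x[0] > 0 else -1
print(training_error(G, X, y))  # -> 0.0
```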

The error bound of AdaBoost's final classifier is:

$$
\begin{aligned}
ERR &= \frac{1}{N}\sum_{i=1}^N I(G(x_i) \neq y_i) && (1) \\
&\le \frac{1}{N}\sum_{i=1}^N \exp(-y_i f(x_i)) && (2) \\
&= \prod_m Z_m && (3) \\
&= \prod_{m=1}^M \left(2\sqrt{e_m(1-e_m)}\right) && (4) \\
&= \prod_{m=1}^M \left(2\sqrt{\tfrac{1}{4}-\gamma_m^2}\right) && (5) \\
&\le \exp\left(-2\sum_{m=1}^M \gamma_m^2\right) && (6)
\end{aligned}
$$
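Each link of this chain can be checked numerically. Below is a minimal AdaBoost sketch (my own illustration, not the book's reference code): one-feature threshold stumps on a synthetic noisy dataset, with the number of rounds, the data generator, and the stump search all being assumptions made for the demo.

```python
import numpy as np

# Synthetic data: x in [-1, 1], labels y = sign(x) plus noise,
# so no single stump classifies everything correctly.
rng = np.random.default_rng(0)
N = 200
x = rng.uniform(-1, 1, N)
y = np.where(x + 0.1 * rng.normal(size=N) > 0, 1, -1)

w = np.full(N, 1.0 / N)   # initial weights w_{1i} = 1/N
f = np.zeros(N)           # running values f(x_i) = sum_m a_m G_m(x_i)
Z_prod, gamma_sq_sum = 1.0, 0.0

for m in range(10):
    # Greedy stump: pick threshold t and polarity s minimizing weighted error.
    best = None
    for t in np.unique(x):
        for s in (1, -1):
            pred = np.where(x > t, s, -s)
            e = w[pred != y].sum()
            if best is None or e < best[0]:
                best = (e, pred)
    e_m, pred = best
    if e_m == 0:          # data separable by one stump; the bound is trivial
        break
    a_m = 0.5 * np.log((1 - e_m) / e_m)         # classifier weight (Eq. 4)
    Z_m = np.sum(w * np.exp(-a_m * y * pred))   # normalizer Z_m
    w = w * np.exp(-a_m * y * pred) / Z_m       # weight update (Eq. 8.4)
    f += a_m * pred
    Z_prod *= Z_m
    gamma_sq_sum += (0.5 - e_m) ** 2            # gamma_m = 1/2 - e_m

ERR = np.mean(np.sign(f) != y)                  # 0-1 error of G = sign(f)
exp_bound = np.mean(np.exp(-y * f))             # exponential bound, step (2)
print(ERR, exp_bound, Z_prod, np.exp(-2 * gamma_sq_sum))

assert ERR <= exp_bound                         # step (2)
assert np.isclose(exp_bound, Z_prod)            # step (3)
assert Z_prod <= np.exp(-2 * gamma_sq_sum)      # steps (4)-(6)
```

Each `assert` mirrors one link of the chain; the exponential bound shrinks geometrically with the number of rounds as long as every $\gamma_m$ stays bounded away from zero.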

Explanation:

(1): the definition of ERR.

(2): it suffices to show that $I(G(x_i) \neq y_i) \le \exp(-y_i f(x_i))$ holds for every sample $i$.

When $G(x_i) \neq y_i$:

$$
y_i f(x_i) < 0 \Rightarrow \exp(-y_i f(x_i)) > 1 = I(G(x_i) \neq y_i)
$$

When $G(x_i) = y_i$:

$$
y_i f(x_i) > 0 \Rightarrow \exp(-y_i f(x_i)) > 0 = I(G(x_i) \neq y_i)
$$

Summing over $i$ and dividing by $N$ gives the inequality.

(3):

$$
\begin{aligned}
\frac{1}{N}\sum_{i=1}^N \exp(-y_i f(x_i)) &= \frac{1}{N}\sum_{i=1}^N \exp\Big(-\sum_{m=1}^M a_m y_i G_m(x_i)\Big), && \text{Eq. 6} \\
&= \sum_i w_{1i}\prod_{m=1}^M \exp(-a_m y_i G_m(x_i)), && \text{Eq. 1} \\
&= Z_1 \sum_i w_{2i}\prod_{m=2}^M \exp(-a_m y_i G_m(x_i)), && \text{Eq. 8.4} \\
&= Z_1 Z_2 \cdots Z_{M-1} \sum_i w_{Mi}\exp(-a_M y_i G_M(x_i)) \\
&= \prod_{m=1}^M Z_m
\end{aligned}
$$
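Step (3) is pure algebra: the telescoping holds for *any* coefficients $a_m$, not only the ones AdaBoost picks. A quick numeric check with random (hypothetical) predictions and coefficients:

```python
import numpy as np

# Verify: (1/N) sum_i exp(-sum_m a_m y_i G_m(x_i)) == prod_m Z_m
# for arbitrary made-up predictions G[m, i] and coefficients a[m].
rng = np.random.default_rng(1)
N, M = 50, 5
y = rng.choice([-1, 1], N)
G = rng.choice([-1, 1], (M, N))   # G[m, i] = G_m(x_i)
a = rng.uniform(0.1, 1.0, M)      # arbitrary a_m

w = np.full(N, 1.0 / N)           # w_{1i} = 1/N  (Eq. 1)
Zs = []
for m in range(M):
    u = w * np.exp(-a[m] * y * G[m])
    Zs.append(u.sum())            # Z_m is whatever normalizes the next weights
    w = u / u.sum()               # weight update (Eq. 8.4)

lhs = np.mean(np.exp(-y * (a[:, None] * G).sum(axis=0)))
print(lhs, np.prod(Zs))
assert np.isclose(lhs, np.prod(Zs))
```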

(4):

$$
\begin{aligned}
Z_m &= \sum_{i=1}^N w_{mi}\exp(-a_m y_i G_m(x_i)) \\
&= \sum_{y_i = G_m(x_i)} w_{mi}e^{-a_m} + \sum_{y_i \neq G_m(x_i)} w_{mi}e^{a_m}, && \text{the sign of } y_iG_m(x_i) \text{ indicates whether sample } i \text{ is classified correctly} \\
&= (1-e_m)\exp(-a_m) + e_m\exp(a_m), && \text{Eq. 3} \\
&= (1-e_m)\sqrt{\frac{e_m}{1-e_m}} + e_m\sqrt{\frac{1-e_m}{e_m}}, && \text{Eq. 4} \\
&= 2\sqrt{e_m(1-e_m)}
\end{aligned}
$$
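This collapse can be checked with arbitrary (hypothetical) weights and predictions; the only real assumption is that $a_m$ is chosen as in Eq. 4:

```python
import numpy as np

# Verify: once a_m = (1/2) ln((1-e_m)/e_m), the normalizer
# Z_m = sum_i w_i exp(-a_m y_i G_m(x_i)) equals 2*sqrt(e_m*(1-e_m)).
rng = np.random.default_rng(2)
N = 30
w = rng.random(N); w /= w.sum()   # some current distribution w_{mi}
y = rng.choice([-1, 1], N)
pred = rng.choice([-1, 1], N)     # arbitrary predictions G_m(x_i)

e = w[pred != y].sum()            # weighted error e_m (Eq. 3)
a = 0.5 * np.log((1 - e) / e)     # Eq. 4
Z = np.sum(w * np.exp(-a * y * pred))
print(Z, 2 * np.sqrt(e * (1 - e)))
assert np.isclose(Z, 2 * np.sqrt(e * (1 - e)))
```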

(5): substitute $\gamma_m = \frac{1}{2} - e_m$, so that $e_m(1-e_m) = \frac{1}{4} - \gamma_m^2$.

(6): follows from $1 - x \le e^{-x}$ (the first-order Taylor bound on $e^{-x}$) applied with $x = 4\gamma_m^2$:

$$
2\sqrt{\tfrac{1}{4} - \gamma_m^2} = \sqrt{1 - 4\gamma_m^2} \le \sqrt{e^{-4\gamma_m^2}} = e^{-2\gamma_m^2}
$$
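As a quick sketch, the inequality behind (6) can be checked over the whole admissible range $\gamma_m \in [0, \tfrac{1}{2}]$:

```python
import numpy as np

# Check sqrt(1 - 4*gamma^2) <= exp(-2*gamma^2), i.e. 1 - x <= exp(-x)
# with x = 4*gamma^2, on a grid of gamma values in [0, 1/2].
gamma = np.linspace(0, 0.5, 101)
lhs = np.sqrt(np.clip(1 - 4 * gamma**2, 0, None))  # clip guards the endpoint
rhs = np.exp(-2 * gamma**2)
assert np.all(lhs <= rhs + 1e-12)
print(lhs[50], rhs[50])  # the two sides at gamma = 0.25
```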
