证明:对偶函数的极大化=模型的极大似然估计

模型的极大似然估计

条件概率分布P(Y|X)的对数似然函数表示为:

\begin{aligned}
L_{\tilde P}(P_w) = \log \prod_{x,y}P(y|x)^{\tilde P(x, y)} \\
= \sum_{x,y}\tilde P(x, y)\log P(y|x) && {1}
\end{aligned}

当条件概率分布P(Y|X)是最大熵模型时,即

P(yx)=exp(i=0n)wi+1fi(x,y)exp(1w+0)2\begin{aligned} P(y|x) = \frac{exp(\sum_{i=0}^n)w_{i+1}f_i(x, y)}{exp(1-w+0)} && {2} \end{aligned}

将公式(2)代入公式(1)得:

LP~(Pw)=x,yP~(x,y)i=0nwi+1fi(x,y)xP~(x)logZw(x)L_{\tilde P}(P_w) = \sum_{x,y} \tilde P(x, y) \sum_{i=0}^n w_{i+1}f_i(x, y) - \sum_x \tilde P(x) \log Z_w(x)

对偶函数的极大化

对偶函数定义如下:

Ψ(w)=L(Pw,w)=x,yP~(x)Pw(yx)logPw(yx)=x,yP~(x,y)i=0nwi+1fi(x,y)xP~(x)logZw(x)\begin{aligned} \Psi (w) = L(P_w, w) \\ = \sum_{x,y}\tilde P(x) P_w(y|x)\log P_w(y|x) \\ = \sum_{x,y} \tilde P(x, y) \sum_{i=0}^n w_{i+1}f_i(x, y) - \sum_x \tilde P(x) \log Z_w(x) \end{aligned}

结论

LP~(Pw)=Ψ(w)L_{\tilde P}(P_w) = \Psi (w)

即:对偶函数的极大化=模型的极大似然估计

Last updated

Was this helpful?