# Cross-Entropy

[Description source: How to Understand the Asymmetry of KL Divergence | 机器之心]

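$$P(y_i \mid z^{(l-1)}) = \left(\hat{y}_i^{(L+1)}\right)^{y_i} \left(1-\hat{y}_i^{(L+1)}\right)^{1-y_i}$$

$$\log P(y_i \mid z^{(l-1)}) = \log\left(\left(\hat{y}_i^{(L+1)}\right)^{y_i} \left(1-\hat{y}_i^{(L+1)}\right)^{1-y_i}\right) = y_i \log \hat{y}_i^{(L+1)} + (1-y_i)\log\left(1-\hat{y}_i^{(L+1)}\right)$$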

$$-\log P(y_i \mid z^{(l-1)}) = -\log\left(\left(\hat{y}_i^{(L+1)}\right)^{y_i} \left(1-\hat{y}_i^{(L+1)}\right)^{1-y_i}\right)$$

$$L\left(\hat{y}_i^{(L+1)}, y_i\right) = -\frac{1}{t}\sum_{i=1}^{t}\left(y_i \log \hat{y}_i^{(L+1)} + (1-y_i)\log\left(1-\hat{y}_i^{(L+1)}\right)\right)$$

[Description source: Implementing a CNN by Hand: A Survey Paper Explains the Mathematical Essence of Convolutional Networks | 机器之心]
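The averaged loss above, the negative mean of y_i·log(ŷ_i) + (1 − y_i)·log(1 − ŷ_i) over t samples, can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `binary_cross_entropy` and the clipping constant `eps` are assumptions introduced here, and the clipping step is not part of the formula itself. It is added only so that a prediction of exactly 0 or 1 does not produce log(0).

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over t samples.

    y_true: labels y_i, each 0 or 1.
    y_pred: predicted probabilities for y_i = 1, each in [0, 1].
    eps:    hypothetical clipping constant (an assumption of this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    t = len(y_true)
    if t == 0:
        raise ValueError("at least one sample is required")

    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip the prediction away from 0 and 1 to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # Per-sample log-likelihood: y*log(p) + (1 - y)*log(1 - p).
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)

    # The loss is the negative mean log-likelihood.
    return -total / t


# An uninformative prediction of 0.5 for every sample gives log(2).
print(binary_cross_entropy([1, 0], [0.5, 0.5]))  # ≈ 0.693
```

Confident, correct predictions drive the loss toward 0, while confident, wrong predictions are penalized heavily, which is why this loss is a common choice for training binary classifiers.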

## History

| Year | Event | Related paper / Reference |
| --- | --- | --- |
| 1948 | Claude Elwood Shannon introduced the thermodynamic concept of entropy into information theory. | Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal. |
| 1999 | Reuven Y. Rubinstein proposed the cross-entropy method. | Rubinstein, R. Y. (1999). The cross-entropy method for combinatorial and continuous optimization. Methodology and Computing in Applied Probability, 2, 127–190. |
| 2003 | The cross-entropy method was successfully applied to estimating rare-event probabilities in dynamic models, in particular queueing models with light-tailed and heavy-tailed input distributions. | Kroese, D. and Rubinstein, R. (2003). The Transform Likelihood Ratio Method for Rare Event Simulation with Heavy Tails. Queueing Systems, 46(3-4), 317–351. |
| 2004 | Ishai Menache, Shie Mannor, and Nahum Shimkin applied cross-entropy to reinforcement learning. | Menache, I., Mannor, S., and Shimkin, N. (2004). Basis Function Adaptation in Temporal Difference Reinforcement Learning. Annals of Operations Research. Submitted. |

## Development Analysis

### Future Directions

Contributor: Yuanyuan Li