Editor of this issue: 钟蔚弘. Managing editor of this issue: 张伟男

EMNLP 2020 | Open-Domain Generative Dialogue Based on Counterfactual Reasoning

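Paper title: Counterfactual Off-Policy Training for Neural Dialogue Generation
Paper authors: 朱庆福 (Qingfu Zhu), 张伟男 (Weinan Zhang), 刘挺 (Ting Liu), 王威廉 (William Yang Wang)
Original author of this article: 朱庆福
Paper link: https://arxiv.org/abs/2004.14507
Reposts must credit the source: 哈工大SCIR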

1. Introduction

2. Model Structure

2.1 Structural Causal Model

2.2 Intervention

2.3 Counterfactual Inference

3. Experimental Results

4. Experimental Analysis

5. Conclusion


