Author: 赵天琪

Paper Notes (3)

<Recurrent neural network based language model>

1. Defining a language model: The article linked below defines a language model as P(S), the probability of a sentence S. By the chain rule of probability:
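For reference, the chain-rule factorization can be written as follows. The notation w_1, ..., w_n for the words of S is an assumption made here for illustration.

```latex
P(S) = P(w_1, w_2, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
```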

A naive Bayes assumption ignores the context entirely, which means:
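With the context dropped, each word is treated as independent of the words before it. A sketch of the resulting approximation, using the same assumed notation w_1, ..., w_n for the words of S:

```latex
P(S) \approx \prod_{i=1}^{n} P(w_i)
```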

This is called a unigram model. Similarly, if the context is one preceding word, it is called a bigram model, and if the context is two preceding words, a trigram model. The general case is the n-gram model.
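To make the bigram case concrete, here is a minimal sketch that estimates P(w_i | w_{i-1}) by maximum likelihood from raw counts. The function name `bigram_probs`, the toy corpus, and the `<s>` / `</s>` boundary markers are all hypothetical choices for illustration; a real model would also need smoothing for unseen pairs.

```python
from collections import Counter

def bigram_probs(tokens):
    """Maximum-likelihood bigram estimates: count(prev, cur) / count(prev)."""
    # Count each word as a context (every token except the last has a successor).
    context_counts = Counter(tokens[:-1])
    # Count adjacent word pairs. For simplicity, pairs that cross a sentence
    # boundary such as ("</s>", "<s>") are counted too.
    pair_counts = Counter(zip(tokens[:-1], tokens[1:]))
    return {pair: c / context_counts[pair[0]] for pair, c in pair_counts.items()}

# Hypothetical toy corpus with explicit sentence boundary markers.
tokens = "<s> the cat sat </s> <s> the dog sat </s>".split()
probs = bigram_probs(tokens)

print(probs[("the", "cat")])   # "the" is followed by "cat" once out of two times
print(probs[("sat", "</s>")])  # "sat" is always followed by "</s>"
```

A trigram model would follow the same pattern with pairs of preceding words as the context.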

This is a great article introducing language models: 漫谈 Language Model (1): 原理篇

2. This paper introduces the RNNLM. The architecture is usually called a simple recurrent neural network or Elman network. x(t) is the input layer, s(t) is the hidden layer (or state), y(t) is the output layer, and w(t) is the current word. They are all vectors. The plus sign in the first equation denotes concatenation, since w(t) and s(t-1) have different dimensions. f and g are the sigmoid and softmax functions, respectively. y(t) is therefore a probability distribution over the next word.
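The forward pass described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper's implementation: the weight matrices `U` and `V`, the layer sizes, the random initialization range, and the initial state of 0.1 are all hypothetical choices made here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def rnn_step(w, s_prev, U, V):
    """One time step: x(t) = [w(t); s(t-1)], s(t) = f(U x(t)), y(t) = g(V s(t))."""
    x = w + s_prev  # list concatenation, not element-wise addition
    s = [sigmoid(sum(u * xi for u, xi in zip(row, x))) for row in U]
    y = softmax([sum(v * sj for v, sj in zip(row, s)) for row in V])
    return s, y

# Hypothetical sizes and small random weights, for illustration only.
vocab_size, hidden_size = 4, 3
rng = random.Random(0)
U = [[rng.uniform(-0.1, 0.1) for _ in range(vocab_size + hidden_size)]
     for _ in range(hidden_size)]
V = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
     for _ in range(vocab_size)]

w = [0.0, 1.0, 0.0, 0.0]      # one-hot vector for the current word (index 1)
s_prev = [0.1] * hidden_size  # assumed initial hidden state
s, y = rnn_step(w, s_prev, U, V)
print(y)                      # a distribution over the 4 vocabulary words
```

Feeding the returned `s` back in as `s_prev` on the next word is what makes the network recurrent.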

The error function is simple:

desired(t) is a one-hot vector representing the ground truth.
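As a small worked example, the sketch below assumes the error vector takes the form desired(t) - y(t), which is the gradient signal that softmax combined with cross-entropy produces at the output layer. The function names and the example numbers are hypothetical.

```python
import math

def output_error(desired, y):
    """Error vector at the output layer, assumed to be desired(t) - y(t)."""
    return [d - p for d, p in zip(desired, y)]

def cross_entropy(desired, y):
    """Cross-entropy between the one-hot target and the predicted distribution."""
    return -sum(d * math.log(p) for d, p in zip(desired, y) if d > 0)

y = [0.1, 0.7, 0.2]        # hypothetical network output for a 3-word vocabulary
desired = [0.0, 1.0, 0.0]  # ground truth: the next word is word 1

err = output_error(desired, y)
loss = cross_entropy(desired, y)
print(err)   # positive only at the true word, negative elsewhere
print(loss)  # equals -log(0.7)
```

Because desired(t) is one-hot, the cross-entropy reduces to the negative log of the probability assigned to the true next word.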

3. For rare words, we have:

If w(t+1) is rare, the probability mass assigned to rare words is distributed uniformly among them; otherwise we use y_i(t) directly.
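A minimal sketch of this rule, assuming all rare words share a single output unit whose probability is divided equally among them. The names `next_word_prob`, `index_of`, `rare_words`, and `rare_index`, as well as the toy vocabulary, are hypothetical.

```python
def next_word_prob(word, y, index_of, rare_words, rare_index):
    """Probability of `word` being next, given the output distribution y."""
    if word in rare_words:
        # Rare words split the mass of the shared rare unit uniformly.
        return y[rare_index] / len(rare_words)
    # Frequent words read their probability directly from the output layer.
    return y[index_of[word]]

# Hypothetical setup: three frequent words plus one shared unit for rare words.
index_of = {"the": 0, "cat": 1, "sat": 2}
rare_index = 3
rare_words = {"aardvark", "zyzzyva", "quokka", "ocelot"}
y = [0.5, 0.3, 0.1, 0.1]

print(next_word_prob("cat", y, index_of, rare_words, rare_index))
print(next_word_prob("quokka", y, index_of, rare_words, rare_index))
```

Summing the result over every frequent and rare word still gives 1, so the rule yields a valid distribution over the full vocabulary.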

