[PaddlePaddle Developer Says] Wang Cheng, deep learning enthusiast, Huaiyin Normal University; research focus: computer vision, image and video processing.
![](https://image.jiqizhixin.com/uploads/editor/d6139e94-981c-49b2-ac5c-1f4745474a19/640.png)
![](https://image.jiqizhixin.com/uploads/editor/1548f7ba-678c-4c84-b9f2-d2e82a6aa994/640.png)
Shortcomings of Convolutional Neural Networks
![](https://image.jiqizhixin.com/uploads/editor/3bfc9171-e8e5-4f96-a5ed-4f2274823e6c/640.png)
![](https://image.jiqizhixin.com/uploads/editor/7855549b-6d55-426c-8934-800548a03298/640.png)
![](https://image.jiqizhixin.com/uploads/editor/11f3f8ae-e768-4596-8330-7fdb49549444/640.png)
![](https://image.jiqizhixin.com/uploads/editor/3286afdd-0d7d-4848-a72b-0c515bd80458/640.png)
How Capsules Work
![](https://image.jiqizhixin.com/uploads/editor/88fb5a4a-b7b4-4685-83ec-06263cc8f697/640.png)
![](https://image.jiqizhixin.com/uploads/editor/93ec33fb-ada9-4469-aa7f-3637f3c4f040/640.png)
![](https://image.jiqizhixin.com/uploads/editor/b4e46994-3449-4a13-9be5-7d4c673f0862/640.png)
![](https://image.jiqizhixin.com/uploads/editor/94daa435-3d76-4f13-94dd-f8b6c9cb39ec/640.png)
![](https://image.jiqizhixin.com/uploads/editor/1ee2fbbf-1420-4ab4-9c30-f28c3837769d/640.png)
```python
def squash(self, vector):
    '''
    Squashing function: an activation-like, vector-wise normalization.
    Args:
        vector: a 4-D tensor [batch_size, vector_num, vector_units_num, 1]
    Returns:
        A tensor with the same shape as the input but with its length squashed:
        the larger the input norm |v|, the closer the output norm is to 1.
    '''
    # The norm is taken over the vector dimension (dim=2), one norm per capsule.
    vec_abs = fluid.layers.sqrt(fluid.layers.reduce_sum(
        fluid.layers.square(vector), dim=2, keep_dim=True))
    scalar_factor = fluid.layers.square(vec_abs) / (1 + fluid.layers.square(vec_abs))
    vec_squashed = scalar_factor * fluid.layers.elementwise_div(vector, vec_abs)
    return vec_squashed
```
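To see what the squashing formula |v|²/(1+|v|²) · v/|v| does to a vector's length, here is a minimal NumPy sketch (the helper name `squash_np` is mine, not from the article):

```python
import numpy as np

def squash_np(v, axis=-1, eps=1e-8):
    """Squash: scale v so its norm lies in [0, 1): |v|^2/(1+|v|^2) * v/|v|."""
    sq_norm = np.sum(np.square(v), axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * v / np.sqrt(sq_norm + eps)

long_vec = np.array([6.0, 8.0])     # |v| = 10  -> squashed norm close to 1
short_vec = np.array([0.06, 0.08])  # |v| = 0.1 -> squashed norm close to 0
print(np.linalg.norm(squash_np(long_vec)))
print(np.linalg.norm(squash_np(short_vec)))
```

The direction of the vector is preserved; only its length is remapped into [0, 1), which is what lets a capsule's length act as a probability.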
Dynamic Routing Between Capsules (the Heart of the Algorithm)
![](https://image.jiqizhixin.com/uploads/editor/ba5bc89b-24e0-4e76-a2ce-3022c3fcb91d/640.png)
The first line of the pseudocode specifies the algorithm's inputs: the prediction vectors û obtained by multiplying the lower-level capsule outputs by the weight matrices, and the number of routing iterations r. The last line specifies the output: the vector v_j of the higher-level capsule.

In line 2, b_ij is a temporary variable holding the weight of each lower-level vector for each higher-level capsule; its values are updated one by one during the iterations, and at the start of each iteration it is converted to c_ij by a softmax. When the routing algorithm begins, b_ij is initialized to zero (after the softmax this still yields non-zero, equal coefficients c_ij).

Line 3 states that the steps in lines 4-7 are repeated r times (the number of routing iterations).

Line 4 computes the weights of each lower-level capsule with respect to all higher-level capsules. After the softmax, b_i becomes a set of non-negative weights c_i whose elements sum to 1.

On the first iteration, all coefficients c_ij are equal. For example, with 8 lower-level capsules and 10 higher-level capsules, every c_ij equals 0.1. This initialization maximizes uncertainty: the lower-level capsules do not yet know which higher-level capsule their output fits best. As the process repeats, this uniform distribution changes.

Line 5 is where the higher-level capsules come in. This step sums the input vectors weighted by the routing coefficients determined in the previous step, producing the output vector s_j. Line 6 then applies the squashing function to s_j to obtain v_j.

Line 7 updates the weights, and this is the heart of the routing algorithm. We take the element-wise product of each higher-level capsule's vector v_j with the original lower-level prediction û and sum it, i.e. the inner (dot) product, which measures the similarity between a capsule's input and output (illustrated below), and add the result to the old weight b_ij. This makes lower-level capsules send their output to higher-level capsules with similar output, capturing the similarity between vectors. After this step, the algorithm jumps back to line 3 and repeats the process r times.
▲ The dot-product operation here is the standard inner (dot) product of vectors.
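The routing loop described above can be sketched line-by-line in NumPy. This is a simplified, single-example illustration of the pseudocode, not the article's PaddlePaddle implementation; the function names and the (8 lower, 10 higher, 16-dim) shapes are my own choices:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(v, axis=-1, eps=1e-8):
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def dynamic_routing(u_hat, r=3):
    """u_hat: prediction vectors, shape (num_lower, num_higher, dim)."""
    num_lower, num_higher, dim = u_hat.shape
    b = np.zeros((num_lower, num_higher))          # line 2: logits start at zero
    for _ in range(r):                             # line 3: repeat r times
        c = softmax(b, axis=1)                     # line 4: coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)     # line 5: weighted sum -> s_j
        v = squash(s, axis=-1)                     # line 6: squash -> v_j
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # line 7: agreement (dot product)
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 10, 16))
v = dynamic_routing(u_hat)
print(v.shape)  # (10, 16): one 16-dim vector per higher-level capsule
```

On the first pass, `softmax` of the all-zero `b` gives every c_ij = 0.1, exactly the uniform start described in the text; the `einsum` in line 7 is the dot product û·v_j that rewards agreement.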
```python
class Capsule_Layer(fluid.dygraph.Layer):
    def __init__(self, pre_cap_num, pre_vector_units_num, cap_num, vector_units_num):
        '''
        Capsule layer; can be used just like an ordinary layer.
        Args:
            pre_vector_units_num(int): input vector dimension
            vector_units_num(int): output vector dimension
            pre_cap_num(int): number of input capsules
            cap_num(int): number of output capsules
            routing_iters(int): number of routing iterations; 3 is recommended
        Notes:
            The capsule count and vector dimension affect performance
            and are the main tuning parameters.
        '''
        super(Capsule_Layer, self).__init__()
        self.routing_iters = 3
        self.pre_cap_num = pre_cap_num
        self.cap_num = cap_num
        self.pre_vector_units_num = pre_vector_units_num
        for j in range(self.cap_num):
            self.add_sublayer('u_hat_w' + str(j), fluid.dygraph.Linear(
                input_dim=pre_vector_units_num, output_dim=vector_units_num))

    def squash(self, vector):
        '''
        Squashing function: an activation-like, vector-wise normalization.
        Args:
            vector: a 4-D tensor [batch_size, vector_num, vector_units_num, 1]
        Returns:
            A tensor with the same shape as the input but with its length
            squashed: the larger |v| is, the closer the output norm is to 1.
        '''
        vec_abs = fluid.layers.sqrt(fluid.layers.reduce_sum(
            fluid.layers.square(vector), dim=2, keep_dim=True))
        scalar_factor = fluid.layers.square(vec_abs) / (1 + fluid.layers.square(vec_abs))
        vec_squashed = scalar_factor * fluid.layers.elementwise_div(vector, vec_abs)
        return vec_squashed

    def capsule(self, x, B_ij, j, pre_cap_num):
        '''
        The heart of the dynamic routing algorithm.
        Args:
            x: input, a 4-D tensor of shape
               (batch_size, pre_cap_num, pre_vector_units_num, 1)
            B_ij: routing weights of shape (1, pre_cap_num, cap_num, 1);
                  the j-th group of weights is split out for this computation
            j: index of the output capsule whose routing is being computed
            pre_cap_num: number of input capsules
        Returns:
            v_j: the 4-D output of a single capsule after the routing iterations
            B_ij: the routing weights, concatenated back together after routing
        Notes:
            Mind the case: B_ij / b_ij and C_ij / c_ij are different tensors.
        '''
        x = fluid.layers.reshape(x, (x.shape[0], pre_cap_num, -1))
        u_hat = getattr(self, 'u_hat_w' + str(j))(x)
        u_hat = fluid.layers.reshape(u_hat, (x.shape[0], pre_cap_num, -1, 1))
        shape_list = B_ij.shape  # (1, 1152, 10, 1)
        split_size = [j, 1, shape_list[2] - j - 1]
        for i in range(self.routing_iters):
            C_ij = fluid.layers.softmax(B_ij, axis=2)
            b_il, b_ij, b_ir = fluid.layers.split(B_ij, split_size, dim=2)
            c_il, c_ij, c_ir = fluid.layers.split(C_ij, split_size, dim=2)
            v_j = fluid.layers.elementwise_mul(u_hat, c_ij)
            v_j = fluid.layers.reduce_sum(v_j, dim=1, keep_dim=True)
            v_j = self.squash(v_j)
            v_j_expand = fluid.layers.expand(v_j, (1, pre_cap_num, 1, 1))
            u_v_produce = fluid.layers.elementwise_mul(u_hat, v_j_expand)
            u_v_produce = fluid.layers.reduce_sum(u_v_produce, dim=2, keep_dim=True)
            b_ij += fluid.layers.reduce_sum(u_v_produce, dim=0, keep_dim=True)
        B_ij = fluid.layers.concat([b_il, b_ij, b_ir], axis=2)
        return v_j, B_ij

    def forward(self, x):
        '''
        Args:
            x: shape = (batch_size, pre_caps_num, vector_units_num, 1)
               or (batch_size, C, H, W).
               A (batch_size, C, H, W) tensor is first vectorized to
               (batch_size, pre_caps_num, vector_units_num, 1), which requires
               C * H * W = vector_units_num * caps_num with C >= caps_num.
        Returns:
            capsules: a tensor holding cap_num capsules
        '''
        if x.shape[3] != 1:
            x = fluid.layers.reshape(x, (x.shape[0], self.pre_cap_num, -1))
            temp_x = fluid.layers.split(x, self.pre_vector_units_num, dim=2)
            temp_x = fluid.layers.concat(temp_x, axis=1)
            x = fluid.layers.reshape(temp_x, (x.shape[0], self.pre_cap_num, -1, 1))
            x = self.squash(x)
        B_ij = fluid.layers.ones((1, x.shape[1], self.cap_num, 1), dtype='float32') / self.cap_num
        capsules = []
        for j in range(self.cap_num):
            cap_j, B_ij = self.capsule(x, B_ij, j, self.pre_cap_num)
            capsules.append(cap_j)
        capsules = fluid.layers.concat(capsules, axis=1)
        return capsules
```
Loss Function
![](https://image.jiqizhixin.com/uploads/editor/1487f306-7eca-4203-a922-ce6d79c0df0d/640.png)
```python
def get_loss_v(self, label):
    '''
    Compute the margin loss.
    Args:
        label: one-hot labels of shape (32, 10)
    Notes:
        ReLU is used here to zero out negative values.
        m_plus: once the correct class's probability (|v|) exceeds this value
                its loss is 0, and the closer |v| gets, the smaller the loss.
        m_det: once a wrong class's probability (|v|) falls below this value
               its loss is 0, and the closer |v| gets, the smaller the loss.
        (|v| is the length of the capsule vector.)
    '''
    # Left term: m_plus is a scalar, but broadcasting lets it be
    # subtracted from the (32, 10) tensor directly.
    max_l = fluid.layers.relu(train_params['m_plus'] - self.output_caps_v_lenth)
    # Square, then reshape.
    max_l = fluid.layers.reshape(fluid.layers.square(max_l),
                                 (train_params['batch_size'], -1))  # (32, 10)
    # The right term is computed the same way.
    max_r = fluid.layers.relu(self.output_caps_v_lenth - train_params['m_det'])
    max_r = fluid.layers.reshape(fluid.layers.square(max_r),
                                 (train_params['batch_size'], -1))  # (32, 10)
    # Combine the two terms via element-wise multiplication with the one-hot labels.
    margin_loss = fluid.layers.elementwise_mul(label, max_l) \
        + fluid.layers.elementwise_mul(1 - label, max_r) * train_params['lambda_val']
    self.margin_loss = fluid.layers.reduce_mean(margin_loss, dim=1)
```
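The same margin loss, L = T_c·max(0, m⁺ − |v|)² + λ(1 − T_c)·max(0, |v| − m⁻)², can be checked in plain NumPy. The default margins below (m⁺ = 0.9, m⁻ = 0.1, λ = 0.5) are the standard CapsNet choices; the article's actual `train_params` values are not shown, so treat them as assumptions. Note this sketch sums over classes where the article's code takes a mean over dim=1:

```python
import numpy as np

def margin_loss(v_lengths, labels, m_plus=0.9, m_minus=0.1, lambda_val=0.5):
    """v_lengths: capsule norms, shape (batch, classes); labels: one-hot."""
    max_l = np.square(np.maximum(0.0, m_plus - v_lengths))   # present-class term
    max_r = np.square(np.maximum(0.0, v_lengths - m_minus))  # absent-class term
    loss = labels * max_l + lambda_val * (1 - labels) * max_r
    return loss.sum(axis=1).mean()

# A confident, correct prediction: correct norm > m_plus, wrong norms < m_minus.
v_lengths = np.array([[0.95, 0.05, 0.05]])
labels = np.array([[1.0, 0.0, 0.0]])
print(margin_loss(v_lengths, labels))  # 0.0: every term is inside its margin
```

Both margins act as dead zones: there is no gradient pressure to push the correct capsule's length beyond m⁺ or a wrong capsule's length below m⁻.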
Encoder
![](https://image.jiqizhixin.com/uploads/editor/3d1bf163-abee-43fc-bdcc-dcca839520f2/640.png)
```python
class Capconv_Net(fluid.dygraph.Layer):
    def __init__(self):
        super(Capconv_Net, self).__init__()
        self.add_sublayer('conv0', fluid.dygraph.Conv2D(
            num_channels=1, num_filters=256, filter_size=(9, 9),
            padding=0, stride=1, act='relu'))
        for i in range(8):
            self.add_sublayer('conv_vector_' + str(i), fluid.dygraph.Conv2D(
                num_channels=256, num_filters=32, filter_size=(9, 9),
                stride=2, padding=0, act='relu'))

    def squash(self, vector):
        # Same squashing function as defined earlier (added here so the
        # self.squash call in forward resolves).
        vec_abs = fluid.layers.sqrt(fluid.layers.reduce_sum(
            fluid.layers.square(vector), dim=2, keep_dim=True))
        scalar_factor = fluid.layers.square(vec_abs) / (1 + fluid.layers.square(vec_abs))
        return scalar_factor * fluid.layers.elementwise_div(vector, vec_abs)

    def forward(self, x, v_units_num):
        x = getattr(self, 'conv0')(x)
        capsules = []
        for i in range(v_units_num):
            temp_x = getattr(self, 'conv_vector_' + str(i))(x)
            capsules.append(fluid.layers.reshape(
                temp_x, (train_params['batch_size'], -1, 1, 1)))
        x = fluid.layers.concat(capsules, axis=2)
        x = self.squash(x)
        return x
```
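The encoder stacks a 9×9 convolution with stride 1 (`conv0`) and eight parallel 9×9 convolutions with stride 2. Assuming a 28×28 MNIST input (the `num_channels=1` input suggests it), a quick shape walkthrough shows where the 1152 primary capsules noted in the routing code come from:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Standard conv output size: floor((size + 2p - k) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

h1 = conv_out(28, 9, stride=1)  # conv0: 28 -> 20
h2 = conv_out(h1, 9, stride=2)  # each conv_vector_i: 20 -> 6
caps_per_map = h2 * h2 * 32     # 6 * 6 * 32 = 1152 primary capsules
print(h1, h2, caps_per_map)     # 20 6 1152
```

Each of the 8 parallel convolutions contributes one coordinate of the 8-dimensional capsule vectors, so concatenating them along axis 2 yields the (batch, 1152, 8, 1) tensor that `Capsule_Layer` consumes.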
![](https://image.jiqizhixin.com/uploads/editor/c8b2a827-a64a-432f-9119-c4d308e6ab42/640.png)
Decoder
![](https://image.jiqizhixin.com/uploads/editor/659e965a-9e49-45ef-a7b3-ee484927218a/640.png)
![](https://image.jiqizhixin.com/uploads/editor/19f5a8ff-259d-4f0d-9272-1b8254bf106f/640.png)
Performance Evaluation
![](https://image.jiqizhixin.com/uploads/editor/dc197d67-123f-4301-b2fa-a01c4e19cfcd/640.png)
![](https://image.jiqizhixin.com/uploads/editor/6b37dcac-3a70-4317-82a0-71d5fe389bb6/640.png)
![](https://image.jiqizhixin.com/uploads/editor/a334b6c3-a814-46c9-8e7c-90a2276ee1db/640.png)
For more details on PaddlePaddle, please refer to the documentation below.
GitHub:
Gitee: