# Hand-Computing a Neural Network, Part 3: Reading the Data and Completing the Training

## The Neural Network

### Introducing the Dataset

First, record the locations of the four MNIST IDX files (training and test images and their labels):

```
train_img_path = r'C:\Users\Dell\MNIST\train-images.idx3-ubyte'
train_lab_path = r'C:\Users\Dell\MNIST\train-labels.idx1-ubyte'
test_img_path = r'C:\Users\Dell\MNIST\t10k-images.idx3-ubyte'
test_lab_path = r'C:\Users\Dell\MNIST\t10k-labels.idx1-ubyte'
```

```
import struct
import numpy as np

train_num = 50000
valid_num = 10000
test_num = 10000

with open(train_img_path, 'rb') as f:
    struct.unpack('>4i', f.read(16))   # skip the 16-byte IDX image header
    tmp_img = np.fromfile(f, dtype=np.uint8).reshape(-1, 28*28)
    train_img = tmp_img[:train_num]    # the first 50,000 images are the training set
    valid_img = tmp_img[train_num:]    # images 50,000-60,000 are the validation set

with open(test_img_path, 'rb') as f:
    struct.unpack('>4i', f.read(16))   # skip the 16-byte IDX image header
    test_img = np.fromfile(f, dtype=np.uint8).reshape(-1, 28*28)

with open(train_lab_path, 'rb') as f:
    struct.unpack('>2i', f.read(8))    # skip the 8-byte IDX label header
    tmp_lab = np.fromfile(f, dtype=np.uint8)
    train_lab = tmp_lab[:train_num]
    valid_lab = tmp_lab[train_num:]

with open(test_lab_path, 'rb') as f:
    struct.unpack('>2i', f.read(8))    # skip the 8-byte IDX label header
    test_lab = np.fromfile(f, dtype=np.uint8)
```

`reshape(-1, 28*28)`: a `-1` in a shape tells NumPy to infer that dimension from the array's total size and the remaining dimensions. Here it converts the flat one-dimensional byte array into a two-dimensional matrix with one row per image and 28*28 = 784 pixel values per row.
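A minimal sketch of how the inferred dimension behaves (the names `a` and `b` are just for illustration):

```
import numpy as np

a = np.arange(12)       # shape (12,)
b = a.reshape(-1, 4)    # the -1 is inferred as 12/4 = 3
print(b.shape)          # (3, 4)
```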

```
import matplotlib.pyplot as plt

def show_train(index):
    plt.imshow(train_img[index].reshape(28, 28), cmap='gray')
    print('label: {}'.format(train_lab[index]))

def show_test(index):
    plt.imshow(test_img[index].reshape(28, 28), cmap='gray')
    print('label: {}'.format(test_lab[index]))

def show_valid(index):
    plt.imshow(valid_img[index].reshape(28, 28), cmap='gray')
    print('label: {}'.format(valid_lab[index]))
```

## Training Data

```
import math
import numpy as np

def tanh(x):
    return np.tanh(x)

def softmax(x):
    exp = np.exp(x - x.max())   # subtract the max for numerical stability
    return exp/exp.sum()

dimensions = [28*28, 10]
activation = [tanh, softmax]
distribution = [
    {
        'b': [0, 0]
    },
    {
        'b': [0, 0],
        # Xavier-style range sqrt(6/(fan_in + fan_out)) for the weights
        'w': [-math.sqrt(6/(dimensions[0]+dimensions[1])), math.sqrt(6/(dimensions[0]+dimensions[1]))]
    }
]

# initialize the bias b of a layer
def init_parameters_b(layer):
    dist = distribution[layer]['b']
    return np.random.rand(dimensions[layer])*(dist[1]-dist[0])+dist[0]

# initialize the weights w of a layer
def init_parameters_w(layer):
    dist = distribution[layer]['w']
    return np.random.rand(dimensions[layer-1], dimensions[layer])*(dist[1]-dist[0])+dist[0]

# initialize all parameters
def init_parameters():
    parameter = []
    for i in range(len(distribution)):
        layer_parameter = {}
        for j in distribution[i].keys():
            if j == 'b':
                layer_parameter['b'] = init_parameters_b(i)
            elif j == 'w':
                layer_parameter['w'] = init_parameters_w(i)
        parameter.append(layer_parameter)
    return parameter

# forward pass
def predict(img, parameters):
    l0_in = img + parameters[0]['b']
    l0_out = activation[0](l0_in)
    l1_in = np.dot(l0_out, parameters[1]['w']) + parameters[1]['b']
    l1_out = activation[1](l1_in)
    return l1_out
```
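Before moving on, the forward pass can be sanity-checked on a random input. The following self-contained sketch restates the definitions above (with a random vector standing in for a real image) and verifies that the output is a probability distribution over the 10 digits:

```
import math
import numpy as np

dimensions = [28*28, 10]

def tanh(x):
    return np.tanh(x)

def softmax(x):
    exp = np.exp(x - x.max())
    return exp/exp.sum()

activation = [tanh, softmax]

# Xavier-style range for w, matching the distribution table above
bound = math.sqrt(6/(dimensions[0] + dimensions[1]))
parameters = [
    {'b': np.zeros(dimensions[0])},
    {'b': np.zeros(dimensions[1]),
     'w': np.random.rand(dimensions[0], dimensions[1])*2*bound - bound},
]

def predict(img, parameters):
    l0_in = img + parameters[0]['b']
    l0_out = activation[0](l0_in)
    l1_in = np.dot(l0_out, parameters[1]['w']) + parameters[1]['b']
    l1_out = activation[1](l1_in)
    return l1_out

y = predict(np.random.rand(28*28), parameters)
print(y.shape)   # (10,)
```

The softmax output is non-negative and sums to 1, so `y` can be read directly as class probabilities.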

```
def d_softmax(data):
    sm = softmax(data)
    return np.diag(sm) - np.outer(sm, sm)

def d_tanh(data):
    return 1/(np.cosh(data))**2

differential = {softmax: d_softmax, tanh: d_tanh}
```
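A standard way to gain confidence in hand-derived derivatives is to compare them with finite differences. A self-contained check of `d_softmax` against a numerical Jacobian:

```
import numpy as np

def softmax(x):
    exp = np.exp(x - x.max())
    return exp/exp.sum()

def d_softmax(data):
    sm = softmax(data)
    return np.diag(sm) - np.outer(sm, sm)

x = np.random.rand(5)
h = 1e-6
numeric = np.zeros((5, 5))
for i in range(5):
    dx = np.zeros(5)
    dx[i] = h
    numeric[i] = (softmax(x + dx) - softmax(x))/h   # row i: d softmax / d x_i
analytic = d_softmax(x)
print(np.abs(analytic - numeric).max() < 1e-5)      # True: the two Jacobians agree
```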

```
onehot = np.identity(dimensions[-1])
```

```
def sqr_loss(img, lab, parameters):
    y_pred = predict(img, parameters)
    y = onehot[lab]
    diff = y - y_pred
    return np.dot(diff, diff)
```
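To see the one-hot encoding and the squared loss in miniature, here is a self-contained example with a uniform stand-in prediction instead of a real `predict` call:

```
import numpy as np

onehot = np.identity(10)
y = onehot[3]               # label 3 -> [0, 0, 0, 1, 0, ...]
y_pred = np.full(10, 0.1)   # a uniform "prediction" over the 10 digits
diff = y - y_pred
loss = np.dot(diff, diff)
print(loss)                 # 0.9: 0.81 from the true class + 9 * 0.01 elsewhere
```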

### Computing the Gradient

```
import copy

batch_size = 100   # images per gradient step (100 is an assumed value)

def train_batch(current_batch, parameters, learn_rate):
    # grad_parameters(img, lab, parameters) is the per-image gradient of
    # sqr_loss, returned as a dict with keys 'b0', 'b1' and 'w1';
    # accumulate it over the batch, then average
    grad_accu = grad_parameters(train_img[current_batch*batch_size],
                                train_lab[current_batch*batch_size], parameters)
    for img_i in range(1, batch_size):
        grad_tmp = grad_parameters(train_img[current_batch*batch_size + img_i],
                                   train_lab[current_batch*batch_size + img_i], parameters)
        for key in grad_accu.keys():
            grad_accu[key] += grad_tmp[key]
    # take one gradient-descent step on a deep copy, leaving the input untouched
    parameter_tmp = copy.deepcopy(parameters)
    parameter_tmp[0]['b'] -= learn_rate*grad_accu['b0']/batch_size
    parameter_tmp[1]['b'] -= learn_rate*grad_accu['b1']/batch_size
    parameter_tmp[1]['w'] -= learn_rate*grad_accu['w1']/batch_size
    return parameter_tmp
```
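`train_batch` relies on a per-image gradient. For the network defined earlier (bias added to the input, `tanh`, one dense layer, `softmax`, squared loss), one way to derive it by backpropagation is the sketch below; the function name `grad_parameters` and the keys `'b0'`, `'b1'`, `'w1'` are my own convention, and the earlier definitions are restated so the block is self-contained:

```
import numpy as np

dimensions = [28*28, 10]

def tanh(x):
    return np.tanh(x)

def softmax(x):
    exp = np.exp(x - x.max())
    return exp/exp.sum()

def d_tanh(data):
    return 1/(np.cosh(data))**2

def d_softmax(data):
    sm = softmax(data)
    return np.diag(sm) - np.outer(sm, sm)

onehot = np.identity(dimensions[-1])

def grad_parameters(img, lab, parameters):
    # forward pass, keeping every intermediate value
    l0_in = img + parameters[0]['b']
    l0_out = tanh(l0_in)
    l1_in = np.dot(l0_out, parameters[1]['w']) + parameters[1]['b']
    l1_out = softmax(l1_in)
    # backward pass of the squared loss |y - y_pred|^2
    diff = onehot[lab] - l1_out
    act1 = np.dot(d_softmax(l1_in), diff)   # d_softmax is symmetric
    grad_b1 = -2*act1                                              # dL/d b1
    grad_w1 = -2*np.outer(l0_out, act1)                            # dL/d w1
    grad_b0 = -2*d_tanh(l0_in)*np.dot(parameters[1]['w'], act1)    # dL/d b0
    return {'b0': grad_b0, 'b1': grad_b1, 'w1': grad_w1}
```

Checking such a derivation against finite differences before training is cheap insurance against sign and transpose errors.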

```
def learn_self(learn_rate):
    global parameters
    for i in range(train_num//batch_size):
        if i % 100 == 99:
            print("running batch {}/{}".format(i+1, train_num//batch_size))
        parameters = train_batch(i, parameters, learn_rate)
```

```
def valid_loss(parameters):
    loss_accu = 0
    for img_i in range(valid_num):
        loss_accu += sqr_loss(valid_img[img_i], valid_lab[img_i], parameters)
    return loss_accu
```

```
def valid_accuracy(parameters):
    correct = [predict(valid_img[img_i], parameters).argmax() == valid_lab[img_i]
               for img_i in range(valid_num)]
    print("validation accuracy: {}".format(correct.count(True)/len(correct)))
```
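The accuracy computation is just the fraction of `True` entries in a Boolean list. In miniature, with a stand-in batch of three softmax outputs and their true labels:

```
import numpy as np

preds = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4]])
labels = [1, 0, 1]
correct = [preds[i].argmax() == labels[i] for i in range(3)]
acc = correct.count(True)/len(correct)
print(acc)   # 2 of the 3 argmax predictions match -> 0.666...
```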

*Note: this article was inspired by the Bilibili uploader 大野喵渣 and draws on his code. Interested readers can watch his tutorial videos on neural networks on Bilibili and visit his GitHub:*

https://www.bilibili.com/video/av51197008

https://github.com/YQGong