# How to Tune Machine Learning Algorithms? A Guide to Setting Neural Network Learning Rates

Smith writes that the main theoretical assumption behind cyclical learning rates (as opposed to only decreasing the learning rate) is that "increasing the learning rate might have a short-term negative effect and yet achieve a longer-term beneficial effect." Indeed, his paper includes several examples of loss-function evolution which, compared with a fixed-learning-rate baseline, temporarily deviate to higher loss but ultimately converge to a lower loss.

1. Lengthen the cycle as training progresses

2. Decay the maximum and minimum learning rate bounds after each cycle
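The basic triangular schedule underlying these variations can be sketched as follows. This is a minimal standalone sketch of the triangular policy from Smith's paper (arXiv:1506.01186); the function name and parameter defaults are illustrative, not taken from any library.

```python
import numpy as np

def triangular_lr(iteration, step_size=2000, base_lr=1e-4, max_lr=1e-2):
    """Triangular cyclical learning rate (Smith, arXiv:1506.01186).

    The rate ramps linearly from base_lr up to max_lr and back down,
    completing one full cycle every 2 * step_size iterations.
    """
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

# At the start of a cycle the rate sits at base_lr; halfway through
# (step_size iterations in) it reaches max_lr, then descends again.
```

The two variations above correspond to growing `step_size` over time and shrinking `max_lr` (and `base_lr`) after each completed cycle.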

````python
from keras.callbacks import Callback
from keras import backend as K
import matplotlib.pyplot as plt


class LRFinder(Callback):
    '''
    A simple callback for finding the optimal learning rate range for your model + dataset.

    # Usage
        ```python
            lr_finder = LRFinder(min_lr=1e-5, max_lr=1e-2, steps_per_epoch=10, epochs=3)
            model.fit(X_train, Y_train, callbacks=[lr_finder])
            lr_finder.plot_loss()
        ```

    # Arguments
        min_lr: The lower bound of the learning rate range for the experiment.
        max_lr: The upper bound of the learning rate range for the experiment.
        steps_per_epoch: Number of mini-batches in the dataset.
        epochs: Number of epochs to run experiment. Usually between 2 and 4 epochs is sufficient.

    # References
        Blog post: jeremyjordan.me/nn-learning-rate
        Original paper: https://arxiv.org/abs/1506.01186
    '''

    def __init__(self, min_lr=1e-5, max_lr=1e-2, steps_per_epoch=None, epochs=None):
        super().__init__()
        self.min_lr = min_lr
        self.max_lr = max_lr
        self.total_iterations = steps_per_epoch * epochs
        self.iteration = 0
        self.history = {}

    def clr(self):
        '''Calculate the learning rate.'''
        x = self.iteration / self.total_iterations
        return self.min_lr + (self.max_lr - self.min_lr) * x

    def on_train_begin(self, logs=None):
        '''Initialize the learning rate to the minimum value at the start of training.'''
        logs = logs or {}
        K.set_value(self.model.optimizer.lr, self.min_lr)

    def on_batch_end(self, epoch, logs=None):
        '''Record previous batch statistics and update the learning rate.'''
        logs = logs or {}
        self.iteration += 1

        K.set_value(self.model.optimizer.lr, self.clr())

        self.history.setdefault('lr', []).append(K.get_value(self.model.optimizer.lr))
        self.history.setdefault('iterations', []).append(self.iteration)

        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)

    def plot_lr(self):
        '''Helper function to quickly inspect the learning rate schedule.'''
        plt.plot(self.history['iterations'], self.history['lr'])
        plt.yscale('log')
        plt.xlabel('Iteration')
        plt.ylabel('Learning rate')

    def plot_loss(self):
        '''Helper function to quickly observe the learning rate experiment results.'''
        plt.plot(self.history['lr'], self.history['loss'])
        plt.xscale('log')
        plt.xlabel('Learning rate')
        plt.ylabel('Loss')
````

```python
import numpy as np
from keras.callbacks import LearningRateScheduler


def step_decay_schedule(initial_lr=1e-3, decay_factor=0.75, step_size=10):
    '''
    Wrapper function to create a LearningRateScheduler with step decay schedule.
    '''
    def schedule(epoch):
        return initial_lr * (decay_factor ** np.floor(epoch / step_size))

    return LearningRateScheduler(schedule)


lr_sched = step_decay_schedule(initial_lr=1e-4, decay_factor=0.75, step_size=2)

model.fit(X_train, Y_train, callbacks=[lr_sched])
```