# Industry | Google open-sources deeplearn.js: hardware-accelerated machine learning on the web

deeplearn.js is an open-source, WebGL-accelerated JavaScript library for machine intelligence. It provides efficient machine-learning building blocks, letting us train neural networks in the browser or run pre-trained models in inference mode. It offers an API for building differentiable data-flow graphs, along with a set of mathematical functions that can be used directly.

### Core concepts

NDArrays

The central unit of data in deeplearn.js is the NDArray: a collection of floating-point values shaped into an array of any number of dimensions. An NDArray has a shape attribute that defines its shape. The library provides sugar subclasses for low-rank NDArrays: Scalar, Array1D, Array2D, Array3D, and Array4D.

Example usage with a 2x3 matrix:

```ts
const shape = [2, 3];  // 2 rows, 3 columns
const a = Array2D.new(shape, [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]);
```

An NDArray can store its data either on the GPU as a WebGLTexture, with each pixel holding one floating-point value, or on the CPU as a vanilla JavaScript TypedArray. Most of the time, users should not have to think about the storage, as it is an implementation detail.
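For intuition, the CPU-side layout can be sketched in plain JavaScript (no deeplearn.js involved): the backing store is a flat TypedArray, and the shape maps (row, col) indices into it row-major, matching the 2x3 example above.

```javascript
// Plain-JS sketch: a [2, 3] array backed by a flat Float32Array, row-major.
const shape = [2, 3];
const values = new Float32Array([1.0, 2.0, 3.0, 10.0, 20.0, 30.0]);

// Element (row, col) lives at index row * numCols + col.
const get = (row, col) => values[row * shape[1] + col];

console.log(get(0, 0)); // 1
console.log(get(1, 2)); // 30
```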

NDArrayMath

The library provides an NDArrayMath base class that defines a set of mathematical functions operating on NDArrays.

NDArrayMathGPU

With the NDArrayMathGPU implementation, math operations enqueue shader programs to be executed on the GPU. The calls are non-blocking; reading a result back to the CPU requires a blocking call such as .get():

```ts
const math = new NDArrayMathGPU();

math.scope((keep, track) => {
  const a = track(Array2D.new([2, 2], [1.0, 2.0, 3.0, 4.0]));
  const b = track(Array2D.new([2, 2], [0.0, 2.0, 4.0, 6.0]));

  // Non-blocking math calls.
  const diff = math.sub(a, b);
  const squaredDiff = math.elementWiseMul(diff, diff);
  const sum = math.sum(squaredDiff);
  const size = Scalar.new(a.size);
  const average = math.divide(sum, size);

  // Blocking call to actually read the values from average. Waits until the
  // GPU has finished executing the operations before returning values.
  console.log(average.get());  // average is a Scalar so we use .get()
});
```
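To make the arithmetic above concrete, the same mean-of-squared-differences computation can be checked in plain JavaScript (no GPU or library involved):

```javascript
// Plain-JS check of the computation the GPU example performs.
const a = [1.0, 2.0, 3.0, 4.0];
const b = [0.0, 2.0, 4.0, 6.0];

const diff = a.map((v, i) => v - b[i]);             // [1, 0, -1, -2]
const squaredDiff = diff.map((v) => v * v);         // [1, 0, 1, 4]
const sum = squaredDiff.reduce((s, v) => s + v, 0); // 6
const average = sum / a.length;

console.log(average); // 1.5
```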

math.scope()

Because WebGL textures must be released explicitly, math operations are wrapped in math.scope() closures, which receive two functions, keep() and track():

• keep() marks an NDArray so that it is not automatically cleaned up when the scope in which it was created ends.
• track() tracks an NDArray that we construct directly inside a scope. When the scope ends, every manually tracked NDArray is cleaned up. The results of math.method() functions, like those of other core library functions, are cleaned up automatically, so there is no need to track them manually.
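The keep()/track() contract can be illustrated with a toy sketch; this is an illustrative model of the semantics, not the library's actual implementation:

```javascript
// Toy model of math.scope(): track() registers a resource for disposal at the
// end of the scope; keep() un-registers it so it survives the scope.
function scope(fn) {
  const tracked = [];
  const track = (r) => { tracked.push(r); return r; };
  const keep = (r) => {
    const i = tracked.indexOf(r);
    if (i !== -1) tracked.splice(i, 1);
    return r;
  };
  const result = fn(keep, track);
  tracked.forEach((r) => r.dispose());
  return result;
}

const disposed = [];
scope((keep, track) => {
  track({ dispose: () => disposed.push('a') });       // cleaned up
  keep(track({ dispose: () => disposed.push('b') })); // survives the scope
});
console.log(disposed); // ['a']
```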

```ts
const math = new NDArrayMathGPU();

let output;

// You must have an outer scope, but don't worry, the library will throw an
// error if you don't have one.
math.scope((keep, track) => {
  // CORRECT: By default, math won't track NDArrays that are constructed
  // directly. You can call track() on the NDArray for it to get tracked and
  // cleaned up at the end of the scope.
  const a = track(Scalar.new(2));

  // INCORRECT: This is a texture leak!!
  // math doesn't know about b, so it can't track it. When the scope ends, the
  // GPU-resident NDArray will not get cleaned up, even though b goes out of
  // scope. Make sure you call track() on NDArrays you create.
  const b = Scalar.new(2);

  // CORRECT: By default, math tracks all outputs of math functions.
  const c = math.neg(math.exp(a));

  // CORRECT: d is tracked by the parent scope.
  const d = math.scope(() => {
    // CORRECT: e will get cleaned up when this inner scope ends.
    const e = track(Scalar.new(3));

    // CORRECT: The result of this math function is tracked. Since it is the
    // return value of this scope, it will not get cleaned up with this inner
    // scope. However, the result will be tracked automatically in the parent
    // scope.
    return math.elementWiseMul(e, e);
  });

  // CORRECT, BUT BE CAREFUL: The output of math.tanh will be tracked
  // automatically, however we can call keep() on it so that it will be kept
  // when the scope ends. That means if you are not careful about calling
  // output.dispose() some time later, you might introduce a texture memory
  // leak. A better way to do this would be to return this value as a return
  // value of a scope so that it gets tracked in a parent scope.
  output = keep(math.tanh(d));
});
```

NDArrayMathCPU

With the NDArrayMathCPU implementation, math operations are blocking and execute immediately on the underlying TypedArrays with vanilla JavaScript.

Graphs

The Graph object is the core class for building data-flow graphs. A Graph does not actually hold NDArray data; it only encodes the connectivity between operations.

The Graph class has differentiable operations as top-level member functions. When we call a graph method to add an operation, we get back a Tensor object, which holds only connectivity and shape information.

```ts
const g = new Graph();

// Placeholders are input containers. This is the container for where we will
// feed an input NDArray when we execute the graph.
const inputShape = [3];
const inputTensor = g.placeholder('input', inputShape);

const labelShape = [1];
const labelTensor = g.placeholder('label', labelShape);

// Variables are containers that hold a value that can be updated from training.
// Here we initialize the multiplier variable randomly.
const multiplier = g.variable('multiplier', Array2D.randNormal([1, 3]));

// Top level graph methods take Tensors and return Tensors.
const outputTensor = g.matmul(multiplier, inputTensor);
const costTensor = g.meanSquaredCost(outputTensor, labelTensor);

// Tensors, like NDArrays, have a shape attribute.
console.log(outputTensor.shape);
```
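At run time the graph above computes a [1, 3] × [3] matrix-vector product, which yields a length-1 output. A plain-JavaScript sketch of that forward pass, with made-up weights standing in for the randomly initialized multiplier:

```javascript
// Plain-JS sketch of the forward pass: matmul of a [1, 3] variable with a
// length-3 input. The weights here are hypothetical, for illustration only.
const multiplier = [[2.0, 2.0, 2.0]]; // shape [1, 3]
const input = [1.0, 2.0, 3.0];        // shape [3]

const output = multiplier.map(
    (row) => row.reduce((acc, w, i) => acc + w * input[i], 0));

console.log(output); // [12] -- shape [1], matching outputTensor.shape
```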

Session and FeedEntry

The Session object drives the execution of a graph. A FeedEntry (similar to feed_dict in TensorFlow) supplies the data needed for a run, feeding a value into a Tensor from a given NDArray.

```ts
const learningRate = .001;
const batchSize = 2;

const math = new NDArrayMathGPU();
const session = new Session(g, math);
const optimizer = new SGDOptimizer(learningRate);

const inputs: Array1D[] = [
  Array1D.new([1.0, 2.0, 3.0]),
  Array1D.new([10.0, 20.0, 30.0]),
  Array1D.new([100.0, 200.0, 300.0])
];

const labels: Array1D[] = [
  Array1D.new([2.0, 6.0, 12.0]),
  Array1D.new([20.0, 60.0, 120.0]),
  Array1D.new([200.0, 600.0, 1200.0])
];

// Shuffles inputs and labels and keeps them mutually in sync.
const shuffledInputProviderBuilder =
    new InCPUMemoryShuffledInputProviderBuilder([inputs, labels]);
const [inputProvider, labelProvider] =
    shuffledInputProviderBuilder.getInputProviders();

// Maps tensors to InputProviders.
const feedEntries: FeedEntry[] = [
  {tensor: inputTensor, data: inputProvider},
  {tensor: labelTensor, data: labelProvider}
];

// Wrap session.train in a scope so the cost gets cleaned up automatically.
math.scope(() => {
  // Train takes a cost tensor to minimize. Trains one batch. Returns the
  // average cost as a Scalar.
  const cost = session.train(
      costTensor, feedEntries, batchSize, optimizer, CostReduction.MEAN);

  console.log('last average cost: ' + cost.get());
});
```
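The training step can be unpacked in plain JavaScript. This sketch shows one SGD update on a linear model with a squared-error cost for a single example; it illustrates the mechanics the optimizer performs, not the library's internals, and the weights and label are hypothetical:

```javascript
// One SGD step on a linear model w . x with squared-error cost, in plain JS.
const learningRate = 0.001;
let w = [0.5, 0.5, 0.5];   // current weights (illustrative values)
const x = [1.0, 2.0, 3.0]; // one input example
const y = 2.0;             // its scalar label (illustrative)

const pred = w.reduce((acc, wi, i) => acc + wi * x[i], 0); // 3
const err = pred - y;                                      // 1

// d/dw_i of (pred - y)^2 is 2 * err * x_i; step against the gradient.
w = w.map((wi, i) => wi - learningRate * 2 * err * x[i]);

console.log(w); // weights nudged toward lower cost
```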

```ts
// Wrap session.eval in a scope so the intermediate values get cleaned up
// automatically.
math.scope((keep, track) => {
  const testInput = track(Array1D.new([1.0, 2.0, 3.0]));

  // session.eval can take NDArrays as input data.
  const testFeedEntries: FeedEntry[] = [
    {tensor: inputTensor, data: testInput}
  ];

  const testOutput = session.eval(outputTensor, testFeedEntries);

  console.log('inference output:');
  console.log(testOutput.shape);
  console.log(testOutput.getValues());
});
```