This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Add WebGPU backend. #1676

Merged: annxingyuan merged 18 commits into master from add_webgpu_backend on Apr 16, 2019

Conversation

@annxingyuan (Collaborator) commented on Apr 13, 2019

This PR copies the contents of this repo into src/backends/webgpu.

It also excludes src/backends/webgpu from tsconfig.json and karma.conf.js. I confirmed that the rollup build does not include the WebGPU backend.
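(For context, the exclusion amounts to roughly the following in tsconfig.json; this is a sketch, not the exact diff in this PR, and the globs may differ.)

```json
{
  "exclude": [
    "node_modules/",
    "dist/",
    "src/backends/webgpu/"
  ]
}
```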

Full discussion here (the original PR).



@Kangz left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan)


src/backends/webgpu/chromium.sh, line 2 at r1 (raw file):

#!/bin/bash
exec ./../../../../Chromium.app/Contents/MacOS/Chromium --enable-unsafe-webgpu "$@"

nit: this seems hardcoded for your configuration. Maybe have an env variable to override it?
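A minimal sketch of that suggestion, reusing the relative path from the script as a fallback (the thread later settles on a `CHROME_BIN` variable in the README):

```bash
#!/bin/bash
# Sketch: let an env variable override the Chromium location, falling back
# to the hardcoded relative path from the original script.
CHROME_BIN="${CHROME_BIN:-./../../../../Chromium.app/Contents/MacOS/Chromium}"
exec "$CHROME_BIN" --enable-unsafe-webgpu "$@"
```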


src/backends/webgpu/README.md, line 6 at r1 (raw file):

# TensorFlow.js WebGPU backend

This repo contains a standalone WebGPU backend that implements the TensorFlow.js `KernelBackend` interface.

These comments seem outdated because this PR is in the main tfjs-core.


src/backends/webgpu/src/backend_webgpu.ts, line 80 at r1 (raw file):

  private setBufferData(buffer: any, data: Float32Array|Int32Array|Uint8Array) {
    buffer.setSubData(0, data.slice().buffer);

FYI this API might be removed but we'll have a replacement for it.


src/backends/webgpu/src/backend_webgpu.ts, line 118 at r1 (raw file):

    const data = mapped.slice(0);
    info.buffer.unmap();

info.buffer is not mappable so this isn't needed.


src/backends/webgpu/src/backend_webgpu.ts, line 156 at r1 (raw file):

    });

    // TODO: bind groups should be cached.

Creating bindgroups on the fly will probably never be a bottleneck for tfjs.


src/backends/webgpu/src/backend_webgpu.ts, line 179 at r1 (raw file):

    pass.dispatch(...program.dispatch);
    pass.endPass();
    this.queue.submit([encoder.finish()]);

Ideally we are able to do a single submit for everything. Submits tend to be expensive operations.


src/backends/webgpu/src/backend_webgpu.ts, line 209 at r1 (raw file):

    const mnkb =
        tensor1d([outerShapeA, sharedDim, outerShapeB, batch], 'int32');
    // TODO: dispose mnkb

You can use GPUBuffer.destroy to dispose of the data.


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 27 at r1 (raw file):

  constructor(outputShape: [number, number, number]) {
    this.outputShape = outputShape;
    const tileSize = 2;

Gains would probably come with bigger tile sizes.


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

/**

nit: webgpu_shader.ts / webgpu_program.ts?

@annxingyuan self-assigned this on Apr 15, 2019

@annxingyuan (Collaborator, Author) left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, and @nsthorat)


src/backends/webgpu/chromium.sh, line 2 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

nit: this seems hardcoded for your configuration. Maybe have an env variable to override it?

Done


src/backends/webgpu/README.md, line 6 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

These comments seem outdated because this PR is in the main tfjs-core.

Done


src/backends/webgpu/src/backend_webgpu.ts, line 118 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

info.buffer is not mappable so this isn't needed.

Done


src/backends/webgpu/src/backend_webgpu.ts, line 179 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Ideally we are able to do a single submit for everything. Submits tend to be expensive operations.

Yes - will add a toggle for retained mode in future PR.


src/backends/webgpu/src/backend_webgpu.ts, line 209 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

You can use GPUBuffer.destroy to dispose of the data.

Got it.


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

nit: webgpu_shader.ts / webgpu_program.ts?

Are you referring to the name of this file being webgpu_math? This is to mimic our webgl backend setup.

@dsmilkov (Contributor) left a comment

LGTM with a few minor comments.

Reviewed 13 of 21 files at r1, 1 of 5 files at r2.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

#!/bin/bash

use an absolute path instead of relative, so it works on all macs


src/backends/webgpu/tsconfig.json, line 4 at r1 (raw file):

  "compilerOptions": {
    "module": "commonjs",
    "moduleResolution": "node",

instead of repeating all of these settings, use "extends": "../../tsconfig" to extend from the top-level tsconfig. Note that you still need to add include but you can skip compilerOptions: https://www.typescriptlang.org/docs/handbook/tsconfig-json.html
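A sketch of what that could look like (the relative path and include glob are illustrative; see the linked handbook page for the exact semantics of `extends`):

```json
{
  "extends": "../../tsconfig",
  "include": ["src/"]
}
```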


src/backends/webgpu/tslint.json, line 2 at r1 (raw file):

{
  "extends": ["tslint-no-circular-imports"],

same here, share the configuration with the parent tslint: https://palantir.github.io/tslint/2016/03/31/sharable-configurations-rules.html
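Similarly, a sketch of the shared tslint configuration (the relative path is an assumption about where the parent tslint.json lives; the circular-imports rule is kept from the quoted snippet):

```json
{
  "extends": ["../../tslint.json", "tslint-no-circular-imports"]
}
```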


src/backends/webgpu/benchmarks/tsconfig.json, line 2 at r2 (raw file):

{
  "compilerOptions": {

let's remove the benchmarks folder for now. have the benchmark as xit in the regular unit tests.


src/backends/webgpu/src/index.ts, line 18 at r2 (raw file):

 */

import * as tf from '@tensorflow/tfjs';

depend on tfjs-core instead


src/backends/webgpu/src/mul_test.ts, line 24 at r2 (raw file):

describe('WebGPU backend', () => {
  it('A * B elementwise', async () => {
    await tf.ready;

beforeAll(async () => await tf.ready())


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 81 at r1 (raw file):

            for (uint k=0; k<TileSize; k++) {
              acc += Asub[row][k] * Bsub[k][col];

is there vec4 / SIMD stuff in GLSL 450?


src/backends/webgpu/src/kernels/multiply_webgpu.ts, line 24 at r1 (raw file):

  outputShape: number[];
  userCode: string;
  dispatch: number[];

add a comment explaining what dispatch means
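For context, `dispatch` holds the values that are later spread into `pass.dispatch(...)` in backend_webgpu.ts, i.e. how many compute workgroups to launch. A documented version might read like this (a sketch: the interface name is the `WebGPUProgram` referenced later in the thread, the field list follows the snippet above, and the comment wording is not the final one):

```ts
// Sketch only: a documented version of the program shape used by the backend.
export interface WebGPUProgram {
  outputShape: number[];
  userCode: string;
  // Number of workgroups to launch along [x, y, z]; these values are spread
  // into the compute pass as pass.dispatch(x, y, z).
  dispatch: number[];
}
```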


src/backends/webgpu/src/kernels/webgpu_math.ts, line 30 at r1 (raw file):

export const compileProgram =
    (shaderCompiler: any, shaderKind: any, compileOptions: any, device: any,

hardcode shaderKind to shaderc.shader_kind.compute instead of exposing it as a param. Same with compileOptions, bindings, etc. (we can expose these later if we need control in the webgpu backend itself).


src/backends/webgpu/src/kernels/webgpu_math.ts, line 37 at r1 (raw file):

      const error = result.GetErrorMessage();
      if (error.length) {
        console.error(error);

return console.error() to early terminate


src/backends/webgpu/src/kernels/webgpu_math.ts, line 39 at r1 (raw file):

        console.error(error);
      }
      const code = result.GetBinary().slice(0).buffer;

is slicing the result a potential bottleneck here, since it does a memcpy on typed arrays? cc @kainino0x, @Kangz. Do we have to slice it, since the memory lives on the wasm heap and can be overwritten?

@Kangz left a comment

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @kainino0x, @Kangz, and @nsthorat)


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 81 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

is there vec4 / SIMD stuff in GLSL 450?

Yes but virtually all GPUs use scalar ALUs now so we don't need to optimize using variable SIMD.

(except Intel for vertex shaders, but this is a compute shader and would use scalar ALUs).


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, annxingyuan (Ann Yuan) wrote…

Are you referring to the name of this file being webgpu_math? This is to mimic our webgl backend setup.

Yes I was referring to the name.


src/backends/webgpu/src/kernels/webgpu_math.ts, line 30 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

hardcode shaderKind to shaderc.shader_kind.compute instead of exposing it as a param. Same with compileOptions, bindings, etc. (we can expose these later if we need control in the webgpu backend itself).

bindings is actually important because it depends on the number of inputs and outputs (and whether they are uniform or storage)


src/backends/webgpu/src/kernels/webgpu_math.ts, line 39 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

is slicing the result a potential bottleneck here, since it does a memcpy on typed arrays? cc @kainino0x, @Kangz. Do we have to slice it, since the memory lives on the wasm heap and can be overwritten?

I think this is needed currently because our WebGPU implementation takes an ArrayBuffer. In the future I think we'd take a view for GPUShaderModuleDescriptor.code. @kainino0x would know this best.

@nsthorat (Contributor) left a comment

Reviewed 18 of 21 files at r1, 5 of 5 files at r2.
Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @kainino0x, @Kangz, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

#!/bin/bash

should this be removed?


src/backends/webgpu/karma.conf.js, line 17 at r1 (raw file):

 * =============================================================================
 */

this is fine for now but I think we'll want to merge this with the root karma conf soon


src/backends/webgpu/package.json, line 35 at r1 (raw file):

  },
  "dependencies": {
    "@tensorflow/tfjs": "1.0.4",

this should be core only


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

/**
 * @license
 * Copyright 2018 Google LLC. All Rights Reserved.

2019


src/backends/webgpu/rollup.config.js, line 38 at r1 (raw file):

        include: 'node_modules/**',
        namedExports: {
          './node_modules/seedrandom/index.js': ['alea'],

I don't think you need the seedrandom & crypto stuff


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r2 (raw file):

import * as tf from '@tensorflow/tfjs-webgpu';

license


src/backends/webgpu/src/backend_webgpu.ts, line 30 at r2 (raw file):

  values: Float32Array|Int32Array|Uint8Array,
  id: number,
  buffer?: any,  // WebGPUBuffer

can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 33 at r2 (raw file):

};

interface DataId {}

semicolon


src/backends/webgpu/src/backend_webgpu.ts, line 35 at r2 (raw file):

interface DataId {}

declare const GPUBufferUsage: any;

why declare? can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 39 at r2 (raw file):

export class WebGPUBackend extends KernelBackend {
  device: any;

can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 47 at r2 (raw file):

  private binaryCache: {[key: string]: WebGPUBinary};

  constructor(device: any, shaderc: any) {

can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 79 at r2 (raw file):

  }

  private setBufferData(buffer: any, data: Float32Array|Int32Array|Uint8Array) {

type buffer


src/backends/webgpu/src/backend_webgpu.ts, line 103 at r2 (raw file):

  }

  async getBufferData(info: TensorInfo): Promise<ArrayBuffer> {

private


src/backends/webgpu/src/backend_webgpu.ts, line 130 at r2 (raw file):

  }

  getAndSavePipeline(key: string, getBinary: () => any) {

private method and type return of getBinary()


src/backends/webgpu/src/backend_webgpu.ts, line 137 at r2 (raw file):

  }

  public compileAndRun(

can you make this private?


src/backends/webgpu/src/backend_webgpu.ts, line 138 at r2 (raw file):

  public compileAndRun(
      program: webgpu_math.WebGPUProgram, inputs: any[], output: Tensor) {

type inputs / input


src/backends/webgpu/src/backend_webgpu.ts, line 205 at r2 (raw file):

    const program = new MatMulProgram(output.shape);
    const mnkb =

what is mnkb lol


src/backends/webgpu/src/mul_test.ts, line 18 at r2 (raw file):

 */

import {expectArraysClose} from '@tensorflow/tfjs-core/dist/test_util';

this is public, just import test_util from tfjs-core


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 28 at r2 (raw file):

    this.outputShape = outputShape;
    const tileSize = 2;
    this.dispatch = [

comment about dispatch here as well


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 52 at r2 (raw file):

      shared float Asub[TileSize][TileSize];
      shared float Bsub[TileSize][TileSize];  

trailing blank spaces here


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 57 at r2 (raw file):

        // M is A outer, N is shared, K is B outer
        uint M = MNKB[0], N = MNKB[1], K = MNKB[2], batch = MNKB[3];
        

remove these blank spaces


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 70 at r2 (raw file):

        for (uint t=0; t<numTiles; t++) {

            // Load one tile of A and B into local memory

can you use 2-space indentation here and below?


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Yes I was referring to the name.

Yeah, I also agree webgpu_program is a better name; we should probably do that for gpgpu_math too (not necessary now) :)


src/backends/webgpu/src/kernels/webgpu_math.ts, line 37 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

return console.error() to early terminate

also throw an error
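Combining the two suggestions, a sketch of the error path (the `result` object and its `GetErrorMessage`/`GetBinary` methods follow the quoted snippet; the inline typings are assumptions, since there are no official shaderc typings here):

```ts
// Sketch: validate the shaderc compilation result before using the binary.
function getCompiledCode(result: {
  GetErrorMessage(): string;
  GetBinary(): Uint32Array;
}) {
  const error = result.GetErrorMessage();
  if (error.length) {
    console.error(error);
    throw new Error(`Failed to compile compute shader: ${error}`);
  }
  // slice() copies the SPIR-V out of shaderc's buffer, as in the quoted snippet.
  return result.GetBinary().slice(0).buffer;
}
```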


src/backends/webgpu/src/kernels/webgpu_math.ts, line 25 at r2 (raw file):

export interface WebGPUBinary {
  bindGroupLayout: any;

can you type these at all?

@nsthorat requested a review from nkreeger on April 15, 2019

@kainino0x (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

use an absolute path instead of relative, so it works on all macs

There is no absolute path for this; it must point at a custom build of Chromium (the one provided in the zip file on the original PR).


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

should this be removed?

Makes sense to remove, but include instructions in a README somewhere about how to actually run the thing. Once we have the changes in Canary (soon) it will be slightly simpler, but still require passing a flag to Canary (which seems really tricky with karma, hence this file).


src/backends/webgpu/src/kernels/webgpu_math.ts, line 39 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

I think this is needed currently because our WebGPU implementation takes an ArrayBuffer. In the future I think we'd take a view for GPUShaderModuleDescriptor.code. @kainino0x would know this best.

We are fixing this in Chrome, but the build that you have requires it.

@nkreeger (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

#!/bin/bash

License here for shell scripts too:
https://github.com/tensorflow/tfjs-node/blob/master/scripts/build-npm.sh#L2


src/backends/webgpu/README.md, line 6 at r1 (raw file):

This repo contains a standalone WebGPU

Should we add something about this backend being "experimental" or "under early active development"?


src/backends/webgpu/README.md, line 10 at r1 (raw file):

    $ yarn # to install dependencies
    $ yarn test

Nit: wrap block with triple '`'


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

2018

2019


src/backends/webgpu/tsconfig.json, line 28 at r1 (raw file):

}

nit: whitespace (I think a few of these files in this PR have this)


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r1 (raw file):

import * as tf from '@tensorflow/tfjs-webgpu';

License here.


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 26 at r1 (raw file):

;

nit: white space


src/backends/webgpu/benchmarks/karma.conf.js, line 1 at r1 (raw file):

const karmaTypescriptConfig = {

License here:
https://github.com/tensorflow/tfjs-core/blob/master/karma.conf.js#L1

@annxingyuan (Collaborator, Author) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, kainino0x (Kai Ninomiya) wrote…

There is no absolute path for this; it must point at a custom build of Chromium (the one provided in the zip file on the original PR).

Done - removed this file - added instructions to README on how to run the test suite.


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, kainino0x (Kai Ninomiya) wrote…

Makes sense to remove, but include instructions in a README somewhere about how to actually run the thing. Once we have the changes in Canary (soon) it will be slightly simpler, but still require passing a flag to Canary (which seems really tricky with karma, hence this file).

Done


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

License here for shell scripts too:
https://github.com/tensorflow/tfjs-node/blob/master/scripts/build-npm.sh#L2

Done


src/backends/webgpu/karma.conf.js, line 17 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this is fine for now but I think we'll want to merge this with the root karma conf soon

Sounds good.


src/backends/webgpu/package.json, line 35 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this should be core only

Done


src/backends/webgpu/README.md, line 6 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
This repo contains a standalone WebGPU

Should we add something about this backend being "experimental" or "under early active development"?

Done


src/backends/webgpu/README.md, line 10 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
    $ yarn # to install dependencies
    $ yarn test

Nit: wrap block with triple '`'

Done


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

2019

Done


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
2018

2019

Done


src/backends/webgpu/rollup.config.js, line 38 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

I don't think you need the seedrandom & crypto stuff

Done


src/backends/webgpu/tsconfig.json, line 4 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

instead of repeating all of these settings, use "extends": "../../tsconfig" to extend from the top-level tsconfig. Note that you still need to add include but you can skip compilerOptions: https://www.typescriptlang.org/docs/handbook/tsconfig-json.html

Done


src/backends/webgpu/tsconfig.json, line 28 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
}

nit: whitespace (I think a few of these files in this PR have this)

Done


src/backends/webgpu/tslint.json, line 2 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

same here, share the configuration with the parent tslint: https://palantir.github.io/tslint/2016/03/31/sharable-configurations-rules.html

Done


src/backends/webgpu/benchmarks/karma.conf.js, line 1 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

License here:
https://github.com/tensorflow/tfjs-core/blob/master/karma.conf.js#L1

Done


src/backends/webgpu/benchmarks/tsconfig.json, line 2 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

let's remove the benchmarks folder for now. have the benchmark as xit in the regular unit tests.

Done


src/backends/webgpu/src/backend_webgpu.ts, line 80 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

FYI this API might be removed but we'll have a replacement for it.

Sounds good


src/backends/webgpu/src/backend_webgpu.ts, line 156 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Creating bindgroups on the fly will probably never be a bottleneck for tfjs.

Noted


src/backends/webgpu/src/backend_webgpu.ts, line 30 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you type these?

Done


src/backends/webgpu/src/backend_webgpu.ts, line 33 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

semicolon

Done


src/backends/webgpu/src/backend_webgpu.ts, line 35 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

why declare? can you type these?

These are properties that exist on window in the custom build of Chrome that we got from Corentin. What's our recommended way for extending window?


src/backends/webgpu/src/backend_webgpu.ts, line 79 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

type buffer

Done


src/backends/webgpu/src/backend_webgpu.ts, line 103 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

private

Done


src/backends/webgpu/src/backend_webgpu.ts, line 130 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

private method and type return of getBinary()

Done


src/backends/webgpu/src/backend_webgpu.ts, line 137 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you make this private?

Done


src/backends/webgpu/src/backend_webgpu.ts, line 138 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

type inputs / input

Done


src/backends/webgpu/src/backend_webgpu.ts, line 205 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

what is mnkb lol

Done


src/backends/webgpu/src/mul_test.ts, line 24 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

beforeAll(async () => await tf.ready())

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 27 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Gains would probably come with bigger tile sizes.

Noted


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 28 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

comment about dispatch here as well

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 52 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

trailing blank spaces here

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 57 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

remove these blank spaces

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 70 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you use 2-space indentation here and below?

Done


src/backends/webgpu/src/kernels/multiply_webgpu.ts, line 24 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

add a comment what dispatch means

Done

Added a comment to the interface.


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

License here.

Done


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 26 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
;

nit: white space

Done


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

license

Done


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

Yeah, I also agree webgpu_program is a better name; we should probably do that for gpgpu_math too (not necessary now) :)

Done


src/backends/webgpu/src/kernels/webgpu_math.ts, line 37 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

also throw an error

Done

@annxingyuan requested a review from nkreeger on April 16, 2019

@nkreeger (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/src/backend_webgpu.ts, line 39 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you type these?

I'm going to guess that the spec is still up in the air and we might not have types around for these...


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 81 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Yes but virtually all GPUs use scalar ALUs now so we don't need to optimize using variable SIMD.

(except Intel for vertex shaders, but this is a compute shader and would use scalar ALUs).

@dsmilkov You can experiment with SIMT-like execution by playing with the work-groups (x, y, z).

From my understanding, OpenCL tries to do this by experimenting with batch/work-group sizes during compilation (don't quote me on this!).


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 44 at r3 (raw file):

layout(std430, binding = 2) readonly buffer ssbDimensions 

Just curious - have you tried this with a uniform instead of SSBO? I wonder if there is performance gains or losses vs. each - I bet it depends on the driver, etc. We can add a TODO for this if you haven't checked (don't block this PR on this).


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 76 at r3 (raw file):

// memoryBarrierShared();

Is this one still needed? Do we need to match the call below? Did we ever get an answer from the WebGPU team on this?

Perhaps we can keep a doc of these Compute Shader things going - either a README here or internal doc for now.

@nsthorat (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/src/backend_webgpu.ts, line 39 at r2 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

I'm going to guess that the spec is still up in the air and we might not have types around for these...

Ann decided not to do this, but in general you can still write an interface at the top so the compiler catches typos. It adapts easily when the API changes: just right-click-rename the field and all the code changes with it, so you don't have to do a find and replace.
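A minimal sketch of that approach, limited to the members actually called in the snippets above (setSubData, unmap, destroy, submit); the names and signatures are hand-written stand-ins, not official WebGPU typings:

```ts
// Sketch: narrow hand-written types for the experimental WebGPU objects this
// backend touches, so typos are caught at compile time even without official
// .d.ts files. Extend as more of the API is used.
interface WebGPUBufferLike {
  setSubData(offset: number, data: ArrayBuffer): void;
  unmap(): void;
  destroy(): void;
}

interface WebGPUQueueLike {
  submit(commandBuffers: unknown[]): void;
}
```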

@annxingyuan (Collaborator, Author) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/src/index.ts, line 18 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

depend on tfjs-core instead

Done


src/backends/webgpu/src/mul_test.ts, line 18 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this is public, just import test_util from tfjs-core

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 44 at r3 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
layout(std430, binding = 2) readonly buffer ssbDimensions 

Just curious - have you tried this with a uniform instead of SSBO? I wonder if there is performance gains or losses vs. each - I bet it depends on the driver, etc. We can add a TODO for this if you haven't checked (don't block this PR on this).

Yes - I will do this in the next PR - Kai included a note on which flags to use.


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 76 at r3 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
// memoryBarrierShared();

Is this one still needed? Do we need to match the call below? Did we ever get an answer from the WebGPU team on this?

Perhaps we can keep a doc of these Compute Shader things going - either a README here or internal doc for now.

Yes, Kai responded here: annxingyuan/tfjs-webgpu#2

I have a doc going - will polish it up and share before our sync tomorrow.

@nkreeger (Contributor) left a comment

:lgtm:

Reviewable status: :shipit: complete! 3 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)

@annxingyuan merged commit 9881219 into master on Apr 16, 2019
@annxingyuan deleted the add_webgpu_backend branch on April 16, 2019 17:30

Post-merge inline comment on the new README instructions, quoting:

e.g. in `~/.bash_profile`:

`export CHROME_BIN=$HOME/Documents/PROJECTS/tfjs-core-wrapper/Chromium.app/Contents/MacOS/Chromium`

A Contributor asked:

Did you find a way to pass in the --enable-unsafe-webgpu flag this way?

@annxingyuan (Collaborator, Author) replied, and the Contributor responded:

Aha, awesome.
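For anyone reproducing this setup: karma-chrome-launcher reads `CHROME_BIN` for the browser binary, and extra flags can be passed through a custom launcher. A sketch (the launcher name and surrounding structure are illustrative, not the config that landed in this PR):

```js
// karma.conf.js (sketch)
module.exports = function(config) {
  config.set({
    browsers: ['ChromeWebGPU'],
    customLaunchers: {
      ChromeWebGPU: {
        base: 'Chrome',  // binary resolved via the CHROME_BIN env variable
        flags: ['--enable-unsafe-webgpu']
      }
    }
  });
};
```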
