This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Add WebGPU backend. #1676

Merged: annxingyuan merged 18 commits into master from add_webgpu_backend on Apr 16, 2019

Conversation

@annxingyuan (Collaborator) commented on Apr 13, 2019

This PR copies the contents of this repo into src/backends/webgpu.

It also excludes src/backends/webgpu from tsconfig.json and karma.conf.js. I confirmed that the rollup build does not include the WebGPU backend.
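(For context, the exclusion amounts to roughly the following in tsconfig.json; this is a sketch, not the exact diff in this PR, and the globs may differ.)

```json
{
  "exclude": [
    "node_modules/",
    "dist/",
    "src/backends/webgpu/"
  ]
}
```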

Full discussion here (the original PR).



@Kangz left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan)


src/backends/webgpu/chromium.sh, line 2 at r1 (raw file):

#!/bin/bash
exec ./../../../../Chromium.app/Contents/MacOS/Chromium --enable-unsafe-webgpu "$@"

nit: this seems hardcoded for your configuration. Maybe have an env variable to override it?
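A minimal sketch of that suggestion, reusing the relative path from the script as a fallback (the thread later settles on a `CHROME_BIN` variable in the README):

```bash
#!/bin/bash
# Sketch: let an env variable override the Chromium location, falling back
# to the hardcoded relative path from the original script.
CHROME_BIN="${CHROME_BIN:-./../../../../Chromium.app/Contents/MacOS/Chromium}"
exec "$CHROME_BIN" --enable-unsafe-webgpu "$@"
```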


src/backends/webgpu/README.md, line 6 at r1 (raw file):

# TensorFlow.js WebGPU backend

This repo contains a standalone WebGPU backend that implements the TensorFlow.js `KernelBackend` interface.

These comments seem outdated because this PR is in the main tfjs-core.


src/backends/webgpu/src/backend_webgpu.ts, line 80 at r1 (raw file):

  private setBufferData(buffer: any, data: Float32Array|Int32Array|Uint8Array) {
    buffer.setSubData(0, data.slice().buffer);

FYI this API might be removed but we'll have a replacement for it.


src/backends/webgpu/src/backend_webgpu.ts, line 118 at r1 (raw file):

    const data = mapped.slice(0);
    info.buffer.unmap();

info.buffer is not mappable so this isn't needed.


src/backends/webgpu/src/backend_webgpu.ts, line 156 at r1 (raw file):

    });

    // TODO: bind groups should be cached.

Creating bindgroups on the fly will probably never be a bottleneck for tfjs.


src/backends/webgpu/src/backend_webgpu.ts, line 179 at r1 (raw file):

    pass.dispatch(...program.dispatch);
    pass.endPass();
    this.queue.submit([encoder.finish()]);

Ideally we are able to do a single submit for everything. Submits tend to be expensive operations.


src/backends/webgpu/src/backend_webgpu.ts, line 209 at r1 (raw file):

    const mnkb =
        tensor1d([outerShapeA, sharedDim, outerShapeB, batch], 'int32');
    // TODO: dispose mnkb

You can use GPUBuffer.destroy to dispose of the data.


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 27 at r1 (raw file):

  constructor(outputShape: [number, number, number]) {
    this.outputShape = outputShape;
    const tileSize = 2;

Gains would probably come with bigger tile sizes.


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

/**

nit: webgpu_shader.ts / webgpu_program.ts?

@annxingyuan self-assigned this on Apr 15, 2019

@annxingyuan (Collaborator, Author) left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, and @nsthorat)


src/backends/webgpu/chromium.sh, line 2 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

nit: this seems hardcoded for your configuration. Maybe have an env variable to override it?

Done


src/backends/webgpu/README.md, line 6 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

These comments seem outdated because this PR is in the main tfjs-core.

Done


src/backends/webgpu/src/backend_webgpu.ts, line 118 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

info.buffer is not mappable so this isn't needed.

Done


src/backends/webgpu/src/backend_webgpu.ts, line 179 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Ideally we are able to do a single submit for everything. Submits tend to be expensive operations.

Yes - will add a toggle for retained mode in future PR.


src/backends/webgpu/src/backend_webgpu.ts, line 209 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

You can use GPUBuffer.destroy to dispose of the data.

Got it.


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

nit: webgpu_shader.ts / webgpu_program.ts?

Are you referring to the name of this file being webgpu_math? This is to mimic our webgl backend setup.

@dsmilkov (Contributor) left a comment

LGTM with a few minor comments.

Reviewed 13 of 21 files at r1, 1 of 5 files at r2.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

#!/bin/bash

use an absolute path instead of relative, so it works on all macs


src/backends/webgpu/tsconfig.json, line 4 at r1 (raw file):

  "compilerOptions": {
    "module": "commonjs",
    "moduleResolution": "node",

instead of repeating all of these settings, use "extends": "../../tsconfig" to extend from the top-level tsconfig. Note that you still need to add include but you can skip compilerOptions: https://www.typescriptlang.org/docs/handbook/tsconfig-json.html
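A sketch of what that could look like (the relative path and include glob are illustrative; see the linked handbook page for the exact semantics of `extends`):

```json
{
  "extends": "../../tsconfig",
  "include": ["src/"]
}
```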


src/backends/webgpu/tslint.json, line 2 at r1 (raw file):

{
  "extends": ["tslint-no-circular-imports"],

same here, share the configuration with the parent tslint: https://palantir.github.io/tslint/2016/03/31/sharable-configurations-rules.html
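Similarly, a sketch of the shared tslint configuration (the relative path is an assumption about where the parent tslint.json lives; the circular-imports rule is kept from the quoted snippet):

```json
{
  "extends": ["../../tslint.json", "tslint-no-circular-imports"]
}
```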


src/backends/webgpu/benchmarks/tsconfig.json, line 2 at r2 (raw file):

{
  "compilerOptions": {

let's remove the benchmarks folder for now. have the benchmark as xit in the regular unit tests.


src/backends/webgpu/src/index.ts, line 18 at r2 (raw file):

 */

import * as tf from '@tensorflow/tfjs';

depend on tfjs-core instead


src/backends/webgpu/src/mul_test.ts, line 24 at r2 (raw file):

describe('WebGPU backend', () => {
  it('A * B elementwise', async () => {
    await tf.ready;

beforeAll(async () => await tf.ready())


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 81 at r1 (raw file):

            for (uint k=0; k<TileSize; k++) {
              acc += Asub[row][k] * Bsub[k][col];

is there vec4 / SIMD stuff in GLSL 450?


src/backends/webgpu/src/kernels/multiply_webgpu.ts, line 24 at r1 (raw file):

  outputShape: number[];
  userCode: string;
  dispatch: number[];

add a comment explaining what dispatch means
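For context, `dispatch` holds the values that are later spread into `pass.dispatch(...)` in backend_webgpu.ts, i.e. how many compute workgroups to launch. A documented version might read like this (a sketch: the interface name is the `WebGPUProgram` referenced later in the thread, the field list follows the snippet above, and the comment wording is not the final one):

```ts
// Sketch only: a documented version of the program shape used by the backend.
export interface WebGPUProgram {
  outputShape: number[];
  userCode: string;
  // Number of workgroups to launch along [x, y, z]; these values are spread
  // into the compute pass as pass.dispatch(x, y, z).
  dispatch: number[];
}
```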


src/backends/webgpu/src/kernels/webgpu_math.ts, line 30 at r1 (raw file):

export const compileProgram =
    (shaderCompiler: any, shaderKind: any, compileOptions: any, device: any,

hardcode shaderKind to shaderc.shader_kind.compute instead of exposing it as a param. Same with compileOptions, bindings, etc. (we can expose these later if we need control in the webgpu backend itself).


src/backends/webgpu/src/kernels/webgpu_math.ts, line 37 at r1 (raw file):

      const error = result.GetErrorMessage();
      if (error.length) {
        console.error(error);

return console.error() to early terminate


src/backends/webgpu/src/kernels/webgpu_math.ts, line 39 at r1 (raw file):

        console.error(error);
      }
      const code = result.GetBinary().slice(0).buffer;

is slicing the result a potential bottleneck here, since it does a memcpy on typed arrays? cc @kainino0x, @Kangz. Do we have to slice it, since the memory lives on the wasm heap and can be overwritten?

@Kangz left a comment

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @kainino0x, @Kangz, and @nsthorat)


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 81 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

is there vec4 / SIMD stuff in GLSL 450?

Yes but virtually all GPUs use scalar ALUs now so we don't need to optimize using variable SIMD.

(except Intel for vertex shaders, but this is a compute shader and would use scalar ALUs).


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, annxingyuan (Ann Yuan) wrote…

Are you referring to the name of this file being webgpu_math? This is to mimic our webgl backend setup.

Yes I was referring to the name.


src/backends/webgpu/src/kernels/webgpu_math.ts, line 30 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

hardcode shaderKind to shaderc.shader_kind.compute instead of exposing it as a param. Same with compileOptions, bindings, etc. (we can expose these later if we need control in the webgpu backend itself).

bindings is actually important because it depends on the number of inputs and outputs (and whether they are uniform or storage)


src/backends/webgpu/src/kernels/webgpu_math.ts, line 39 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

is slicing the result a potential bottleneck here, since it does a memcpy on typed arrays? cc @kainino0x, @Kangz. Do we have to slice it, since the memory lives on the wasm heap and can be overwritten?

I think this is needed currently because our WebGPU implementation takes an ArrayBuffer. In the future I think we'd take a view for GPUShaderModuleDescriptor.code. @kainino0x would know this best.

@nsthorat (Contributor) left a comment

Reviewed 18 of 21 files at r1, 5 of 5 files at r2.
Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @kainino0x, @Kangz, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

#!/bin/bash

should this be removed?


src/backends/webgpu/karma.conf.js, line 17 at r1 (raw file):

 * =============================================================================
 */

this is fine for now but I think we'll want to merge this with the root karma conf soon


src/backends/webgpu/package.json, line 35 at r1 (raw file):

  },
  "dependencies": {
    "@tensorflow/tfjs": "1.0.4",

this should be core only


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

/**
 * @license
 * Copyright 2018 Google LLC. All Rights Reserved.

2019


src/backends/webgpu/rollup.config.js, line 38 at r1 (raw file):

        include: 'node_modules/**',
        namedExports: {
          './node_modules/seedrandom/index.js': ['alea'],

I don't think you need the seedrandom & crypto stuff


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r2 (raw file):

import * as tf from '@tensorflow/tfjs-webgpu';

license


src/backends/webgpu/src/backend_webgpu.ts, line 30 at r2 (raw file):

  values: Float32Array|Int32Array|Uint8Array,
  id: number,
  buffer?: any,  // WebGPUBuffer

can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 33 at r2 (raw file):

};

interface DataId {}

semicolon


src/backends/webgpu/src/backend_webgpu.ts, line 35 at r2 (raw file):

interface DataId {}

declare const GPUBufferUsage: any;

why declare? can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 39 at r2 (raw file):

export class WebGPUBackend extends KernelBackend {
  device: any;

can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 47 at r2 (raw file):

  private binaryCache: {[key: string]: WebGPUBinary};

  constructor(device: any, shaderc: any) {

can you type these?


src/backends/webgpu/src/backend_webgpu.ts, line 79 at r2 (raw file):

  }

  private setBufferData(buffer: any, data: Float32Array|Int32Array|Uint8Array) {

type buffer


src/backends/webgpu/src/backend_webgpu.ts, line 103 at r2 (raw file):

  }

  async getBufferData(info: TensorInfo): Promise<ArrayBuffer> {

private


src/backends/webgpu/src/backend_webgpu.ts, line 130 at r2 (raw file):

  }

  getAndSavePipeline(key: string, getBinary: () => any) {

private method and type return of getBinary()


src/backends/webgpu/src/backend_webgpu.ts, line 137 at r2 (raw file):

  }

  public compileAndRun(

can you make this private?


src/backends/webgpu/src/backend_webgpu.ts, line 138 at r2 (raw file):

  public compileAndRun(
      program: webgpu_math.WebGPUProgram, inputs: any[], output: Tensor) {

type inputs / input


src/backends/webgpu/src/backend_webgpu.ts, line 205 at r2 (raw file):

    const program = new MatMulProgram(output.shape);
    const mnkb =

what is mnkb lol


src/backends/webgpu/src/mul_test.ts, line 18 at r2 (raw file):

 */

import {expectArraysClose} from '@tensorflow/tfjs-core/dist/test_util';

this is public, just import test_util from tfjs-core


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 28 at r2 (raw file):

    this.outputShape = outputShape;
    const tileSize = 2;
    this.dispatch = [

comment about dispatch here as well


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 52 at r2 (raw file):

      shared float Asub[TileSize][TileSize];
      shared float Bsub[TileSize][TileSize];  

trailing blank spaces here


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 57 at r2 (raw file):

        // M is A outer, N is shared, K is B outer
        uint M = MNKB[0], N = MNKB[1], K = MNKB[2], batch = MNKB[3];
        

remove these blank spaces


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 70 at r2 (raw file):

        for (uint t=0; t<numTiles; t++) {

            // Load one tile of A and B into local memory

can you use 2-space indentation here and below?


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Yes I was referring to the name.

Yeah, I also agree webgpu_program is a better name; we should probably do that for gpgpu_math too (not necessary now) :)


src/backends/webgpu/src/kernels/webgpu_math.ts, line 37 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

return console.error() to early terminate

also throw an error
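Combining the two suggestions, a sketch of the error path (the `result` object and its `GetErrorMessage`/`GetBinary` methods follow the quoted snippet; the inline typings are assumptions, since there are no official shaderc typings here):

```ts
// Sketch: validate the shaderc compilation result before using the binary.
function getCompiledCode(result: {
  GetErrorMessage(): string;
  GetBinary(): Uint32Array;
}) {
  const error = result.GetErrorMessage();
  if (error.length) {
    console.error(error);
    throw new Error(`Failed to compile compute shader: ${error}`);
  }
  // slice() copies the SPIR-V out of shaderc's buffer, as in the quoted snippet.
  return result.GetBinary().slice(0).buffer;
}
```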


src/backends/webgpu/src/kernels/webgpu_math.ts, line 25 at r2 (raw file):

export interface WebGPUBinary {
  bindGroupLayout: any;

can you type these at all?

@nsthorat requested a review from nkreeger on April 15, 2019

@kainino0x (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

use an absolute path instead of relative, so it works on all macs

There is no absolute path for this; it must point at a custom build of Chromium (the one provided in the zip file on the original PR).


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

should this be removed?

Makes sense to remove, but include instructions in a README somewhere about how to actually run the thing. Once we have the changes in Canary (soon) it will be slightly simpler, but still require passing a flag to Canary (which seems really tricky with karma, hence this file).


src/backends/webgpu/src/kernels/webgpu_math.ts, line 39 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

I think this is needed currently because our WebGPU implementation takes an ArrayBuffer. In the future I think we'd take a view for GPUShaderModuleDescriptor.code. @kainino0x would know this best.

We are fixing this in Chrome, but the build that you have requires it.

@nkreeger (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

#!/bin/bash

License here for shell scripts too:
https://github.com/tensorflow/tfjs-node/blob/master/scripts/build-npm.sh#L2


src/backends/webgpu/README.md, line 6 at r1 (raw file):

This repo contains a standalone WebGPU

Should we add something about this backend being "experimental" or "under early active development"?


src/backends/webgpu/README.md, line 10 at r1 (raw file):

    $ yarn # to install dependencies
    $ yarn test

Nit: wrap block with triple '`'


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

2018

2019


src/backends/webgpu/tsconfig.json, line 28 at r1 (raw file):

}

nit: whitespace (I think a few of these files in this PR have this)


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r1 (raw file):

import * as tf from '@tensorflow/tfjs-webgpu';

License here.


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 26 at r1 (raw file):

;

nit: white space


src/backends/webgpu/benchmarks/karma.conf.js, line 1 at r1 (raw file):

const karmaTypescriptConfig = {

License here:
https://github.com/tensorflow/tfjs-core/blob/master/karma.conf.js#L1

@annxingyuan (Collaborator, Author) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, kainino0x (Kai Ninomiya) wrote…

There is no absolute path for this; it must point at a custom build of Chromium (the one provided in the zip file on the original PR).

Done - removed this file - added instructions to README on how to run the test suite.


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, kainino0x (Kai Ninomiya) wrote…

Makes sense to remove, but include instructions in a README somewhere about how to actually run the thing. Once we have the changes in Canary (soon) it will be slightly simpler, but still require passing a flag to Canary (which seems really tricky with karma, hence this file).

Done


src/backends/webgpu/chromium.sh, line 1 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

License here for shell scripts too:
https://github.com/tensorflow/tfjs-node/blob/master/scripts/build-npm.sh#L2

Done


src/backends/webgpu/karma.conf.js, line 17 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this is fine for now but I think we'll want to merge this with the root karma conf soon

Sounds good.


src/backends/webgpu/package.json, line 35 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this should be core only

Done


src/backends/webgpu/README.md, line 6 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
This repo contains a standalone WebGPU

Should we add something about this backend being "experimental" or "under early active development"?

Done


src/backends/webgpu/README.md, line 10 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
    $ yarn # to install dependencies
    $ yarn test

Nit: wrap block with triple '`'

Done


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

2019

Done


src/backends/webgpu/rollup.config.js, line 3 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
2018

2019

Done


src/backends/webgpu/rollup.config.js, line 38 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

I don't think you need the seedrandom & crypto stuff

Done


src/backends/webgpu/tsconfig.json, line 4 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

instead of repeating all of these settings, use "extends": "../../tsconfig" to extend from the top-level tsconfig. Note that you still need to add include but you can skip compilerOptions: https://www.typescriptlang.org/docs/handbook/tsconfig-json.html

Done


src/backends/webgpu/tsconfig.json, line 28 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
}

nit: whitespace (I think a few of these files in this PR have this)

Done


src/backends/webgpu/tslint.json, line 2 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

same here, share the configuration with the parent tslint: https://palantir.github.io/tslint/2016/03/31/sharable-configurations-rules.html

Done


src/backends/webgpu/benchmarks/karma.conf.js, line 1 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

License here:
https://github.com/tensorflow/tfjs-core/blob/master/karma.conf.js#L1

Done


src/backends/webgpu/benchmarks/tsconfig.json, line 2 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

let's remove the benchmarks folder for now. have the benchmark as xit in the regular unit tests.

Done


src/backends/webgpu/src/backend_webgpu.ts, line 80 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

FYI this API might be removed but we'll have a replacement for it.

Sounds good


src/backends/webgpu/src/backend_webgpu.ts, line 156 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Creating bindgroups on the fly will probably never be a bottleneck for tfjs.

Noted


src/backends/webgpu/src/backend_webgpu.ts, line 30 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you type these?

Done


src/backends/webgpu/src/backend_webgpu.ts, line 33 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

semicolon

Done


src/backends/webgpu/src/backend_webgpu.ts, line 35 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

why declare? can you type these?

These are properties that exist on window in the custom build of Chrome that we got from Corentin. What's our recommended way for extending window?


src/backends/webgpu/src/backend_webgpu.ts, line 79 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

type buffer

Done


src/backends/webgpu/src/backend_webgpu.ts, line 103 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

private

Done


src/backends/webgpu/src/backend_webgpu.ts, line 130 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

private method and type return of getBinary()

Done


src/backends/webgpu/src/backend_webgpu.ts, line 137 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you make this private?

Done


src/backends/webgpu/src/backend_webgpu.ts, line 138 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

type inputs / input

Done


src/backends/webgpu/src/backend_webgpu.ts, line 205 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

what is mnkb lol

Done


src/backends/webgpu/src/mul_test.ts, line 24 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

beforeAll(async () => await tf.ready())

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 27 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Gains would probably come with bigger tile sizes.

Noted


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 28 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

comment about dispatch here as well

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 52 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

trailing blank spaces here

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 57 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

remove these blank spaces

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 70 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you use 2-space indentation here and below?

Done


src/backends/webgpu/src/kernels/multiply_webgpu.ts, line 24 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

add a comment what dispatch means

Done

Added a comment to the interface.


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

License here.

Done


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 26 at r1 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
;

nit: white space

Done


src/backends/webgpu/benchmarks/benchmark_ops_test.ts, line 1 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

license

Done


src/backends/webgpu/src/kernels/webgpu_math.ts, line 1 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

Yeah, I also agree webgpu_program is a better name; we should probably do that for gpgpu_math too (not necessary now) :)

Done


src/backends/webgpu/src/kernels/webgpu_math.ts, line 37 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

also throw an error

Done

@annxingyuan requested a review from nkreeger on April 16, 2019

@nkreeger (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/src/backend_webgpu.ts, line 39 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you type these?

I'm going to guess that the spec is still up in the air and we might not have types around for these...


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 81 at r1 (raw file):

Previously, Kangz (Corentin Wallez) wrote…

Yes but virtually all GPUs use scalar ALUs now so we don't need to optimize using variable SIMD.

(except Intel for vertex shaders, but this is a compute shader and would use scalar ALUs).

@dsmilkov You can experiment with SIMT-like execution by playing with the work-groups (x, y, z).

From my understanding, OpenCL tries to do this by experimenting with batch/work-group sizes during compilation (don't quote me on this!).


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 44 at r3 (raw file):

layout(std430, binding = 2) readonly buffer ssbDimensions 

Just curious - have you tried this with a uniform instead of SSBO? I wonder if there is performance gains or losses vs. each - I bet it depends on the driver, etc. We can add a TODO for this if you haven't checked (don't block this PR on this).


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 76 at r3 (raw file):

// memoryBarrierShared();

Is this one still needed? Do we need to match the call below? Did we ever get an answer from the WebGPU team on this?

Perhaps we can keep a doc of these Compute Shader things going - either a README here or internal doc for now.

@nsthorat (Contributor) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/src/backend_webgpu.ts, line 39 at r2 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…

I'm going to guess that the spec is still up in the air and we might not have types around for these...

Ann decided not to do this, but in general you can still write an interface at the top so the compiler catches typos. It adapts easily when the API changes: just right-click-rename the field and all the code changes with it, so you don't have to do a find and replace.
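A minimal sketch of that approach, limited to the members actually called in the snippets above (setSubData, unmap, destroy, submit); the names and signatures are hand-written stand-ins, not official WebGPU typings:

```ts
// Sketch: narrow hand-written types for the experimental WebGPU objects this
// backend touches, so typos are caught at compile time even without official
// .d.ts files. Extend as more of the API is used.
interface WebGPUBufferLike {
  setSubData(offset: number, data: ArrayBuffer): void;
  unmap(): void;
  destroy(): void;
}

interface WebGPUQueueLike {
  submit(commandBuffers: unknown[]): void;
}
```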

@annxingyuan (Collaborator, Author) left a comment

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)


src/backends/webgpu/src/index.ts, line 18 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

depend on tfjs-core instead

Done


src/backends/webgpu/src/mul_test.ts, line 18 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

this is public, just import test_util from tfjs-core

Done


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 44 at r3 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
layout(std430, binding = 2) readonly buffer ssbDimensions 

Just curious - have you tried this with a uniform instead of SSBO? I wonder if there is performance gains or losses vs. each - I bet it depends on the driver, etc. We can add a TODO for this if you haven't checked (don't block this PR on this).

Yes - I will do this in the next PR - Kai included a note on which flags to use.


src/backends/webgpu/src/kernels/matmul_webgpu.ts, line 76 at r3 (raw file):

Previously, nkreeger (Nick Kreeger) wrote…
// memoryBarrierShared();

Is this one still needed? Do we need to match the call below? Did we ever get an answer from the WebGPU team on this?

Perhaps we can keep a doc of these Compute Shader things going - either a README here or internal doc for now.

Yes, Kai responded here: annxingyuan/tfjs-webgpu#2

I have a doc going - will polish it up and share before our sync tomorrow.

@nkreeger (Contributor) left a comment

:lgtm:

Reviewable status: :shipit: complete! 3 of 1 approvals obtained (waiting on @annxingyuan, @dsmilkov, @Kangz, @nkreeger, and @nsthorat)

@annxingyuan merged commit 9881219 into master on Apr 16, 2019
@annxingyuan deleted the add_webgpu_backend branch on April 16, 2019 17:30

Post-merge inline comment on the new README instructions, quoting:

e.g. in `~/.bash_profile`:

`export CHROME_BIN=$HOME/Documents/PROJECTS/tfjs-core-wrapper/Chromium.app/Contents/MacOS/Chromium`

A Contributor asked:

Did you find a way to pass in the --enable-unsafe-webgpu flag this way?

@annxingyuan (Collaborator, Author) replied, and the Contributor responded:

Aha, awesome.
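For anyone reproducing this setup: karma-chrome-launcher reads `CHROME_BIN` for the browser binary, and extra flags can be passed through a custom launcher. A sketch (the launcher name and surrounding structure are illustrative, not the config that landed in this PR):

```js
// karma.conf.js (sketch)
module.exports = function(config) {
  config.set({
    browsers: ['ChromeWebGPU'],
    customLaunchers: {
      ChromeWebGPU: {
        base: 'Chrome',  // binary resolved via the CHROME_BIN env variable
        flags: ['--enable-unsafe-webgpu']
      }
    }
  });
};
```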
