Skip to content
This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Add tf.io.browserHTTPRequest #1030

Merged
merged 9 commits into from
May 12, 2018
Merged

Conversation

caisq
Copy link
Collaborator

@caisq caisq commented May 9, 2018

  • Allows model artifacts to be sent via a multipart/form-data
    HTTP request

Towards: tensorflow/tfjs#13


This change is Reviewable

@davidsoergel
Copy link
Member

Here's a first set of comments.

Broadly I'm confused about the expected usage pattern for this and IOHandler generally (sorry I missed the prior reviews). It seems unlikely to me that users want to specify a location once and then write to it repeatedly (overwriting, presumably). But the API seems specifically designed for that use case, and the code requires some extra complexity to handle it safely. What am I missing?


Review status: 0 of 3 files reviewed at latest revision, all discussions resolved.


src/io/browser_http.ts, line 30 at r1 (raw file):

import {IOHandler, ModelArtifacts, SaveResult, WeightsManifestConfig} from './types';

export class BrowserHTTPRequest implements IOHandler {

Does this need to be exported, given that users should use the factory function anyway?


src/io/browser_http.ts, line 53 at r1 (raw file):

    if (modelArtifacts.modelTopology instanceof ArrayBuffer) {
      throw new Error(
          'HTTPReqwuests.save() does not support saving model topology ' +

typo, but also I think you meant BrowserHTTPRequest.save()


src/io/browser_http.ts, line 57 at r1 (raw file):

    }

    return new Promise<SaveResult>(async (resolve, reject) => {

Unless I'm misunderstanding something, this is adding an unnecessary layer of Promise, and mixing async/await syntax with callback syntax is confusing in any case. I think you can just delete this line and make the corresponding minor tweaks below.


src/io/browser_http.ts, line 58 at r1 (raw file):

    return new Promise<SaveResult>(async (resolve, reject) => {
      const requestInit =

Don't shadow the field name


src/io/browser_http.ts, line 60 at r1 (raw file):

      const requestInit =
          Object.assign({method: this.DEFAULT_METHOD}, this.requestInit);
      if (requestInit.body == null) {

But if the user passed in a body, then each call to save() uses the same one (because Object.assign makes a shallow copy). Thus the formData.append calls below will accumulate artifacts in the same body.


src/io/browser_http.ts, line 73 at r1 (raw file):

      };

      const formData = requestInit.body as FormData;

I think you could avoid the cast here by preparing formData in the constructor and/or in the conditional above where its type is unambiguous.


src/io/browser_http.ts, line 92 at r1 (raw file):

      if (response.status === 200) {
        resolve({

return


src/io/browser_http.ts, line 97 at r1 (raw file):

        });
      } else {
        reject(response);

throw


src/io/browser_http.ts, line 134 at r1 (raw file):

 * The following Python code snippet based on the
 * [flask](/~https://github.com/pallets/flask) server framework implements a
 * server that can receive the request. Upon receiving the model artifacrts

typo


src/io/browser_http.ts, line 233 at r1 (raw file):

BrowserHTTPRequest

The declared return type is IOHandler; the concrete class is an implementation detail that the caller doesn't care about.


Comments from Reviewable

@caisq
Copy link
Collaborator Author

caisq commented May 11, 2018

As David and I discussed offline, this API that involves calling tf.io.browserHTTPRequest is fine in that

  1. most users will call a simpler API that doesn't involve tf.io.browserHTTPRequest, namely model.save('http://my-server/upload-model'). tf.io.browserHTTPRequest` is used only when the user needs to control additional request parameters such as method, headers and credentials.
  2. there are indeed cases in which a user wants to load/save multiple times from the same object as returned by tf.io.browserHTTPRequest.

Review status: 0 of 3 files reviewed at latest revision, 10 unresolved discussions.


src/io/browser_http.ts, line 30 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

Does this need to be exported, given that users should use the factory function anyway?

Good question. It doesn't need to be exported. I fixed it. Did the same fix with other IOHandler subtypes we have so far.


src/io/browser_http.ts, line 53 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

typo, but also I think you meant BrowserHTTPRequest.save()

Done.


src/io/browser_http.ts, line 57 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

Unless I'm misunderstanding something, this is adding an unnecessary layer of Promise, and mixing async/await syntax with callback syntax is confusing in any case. I think you can just delete this line and make the corresponding minor tweaks below.

Done. Good point. Thanks.


src/io/browser_http.ts, line 58 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

Don't shadow the field name

Done.


src/io/browser_http.ts, line 60 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

But if the user passed in a body, then each call to save() uses the same one (because Object.assign makes a shallow copy). Thus the formData.append calls below will accumulate artifacts in the same body.

Good catch. I decided to not allow an existing FormData passed in. The reason is that FormData is not enumerable across all browsers:
https://stackoverflow.com/questions/22409667/how-to-combine-two-javascript-formdata-objects

Chrome has a method FormData.entries() since 2016. But I don't want to do a Chrome-only thing.

I update the code and the doc.


src/io/browser_http.ts, line 73 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

I think you could avoid the cast here by preparing formData in the constructor and/or in the conditional above where its type is unambiguous.

See my comment above.


src/io/browser_http.ts, line 92 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

return

Done.


src/io/browser_http.ts, line 97 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

throw

Done.


src/io/browser_http.ts, line 134 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…

typo

Done.


src/io/browser_http.ts, line 233 at r1 (raw file):

Previously, davidsoergel (David Soergel) wrote…
BrowserHTTPRequest

The declared return type is IOHandler; the concrete class is an implementation detail that the caller doesn't care about.

Fixed the doc.


Comments from Reviewable

@nsthorat
Copy link
Contributor

:lgtm_strong:


Review status: 0 of 6 files reviewed at latest revision, 10 unresolved discussions.


src/io/browser_http.ts, line 102 at r1 (raw file):

  }

  // TODO(cais): Maybe add load to unify this IOHandler type and the mechanism

instead of a TODO, file an issue?


src/io/browser_http.ts, line 128 at r1 (raw file):

 *
 * const saveResult = await model.save(tf.io.browserHTTPRequest(
 *     'http://model-server.domain:5000/upload'));

nit: domain isn't the right name here


Comments from Reviewable

@pyu10055
Copy link
Collaborator

Review status: 0 of 6 files reviewed at latest revision, 12 unresolved discussions, all commit checks successful.


src/io/browser_http.ts, line 50 at r2 (raw file):

  }

  async save(modelArtifacts: ModelArtifacts): Promise<SaveResult> {

should you require the path be of some REST format that take the name of the model?
If not, should the post body contain the name of the model?


src/io/browser_http.ts, line 84 at r2 (raw file):

    }

    const response = await fetch(this.path, init);

this uses browser fetch, it won't be available in node, might be more compatible by using node-fetch directly.


src/io/browser_http.ts, line 128 at r2 (raw file):

 * ```
 *
 * The following Python code snippet based on the

I would think a node server example might be more relevant here.


Comments from Reviewable

@caisq
Copy link
Collaborator Author

caisq commented May 11, 2018

Review status: 0 of 6 files reviewed at latest revision, 15 unresolved discussions, some commit checks pending.


src/io/browser_http.ts, line 57 at r1 (raw file):

Previously, caisq (Shanqing Cai) wrote…

Done. Good point. Thanks.

Now that we don't allow passing in an existing body, the Object.assign here should be fine, because we don't touch any other fields such as headers.


src/io/browser_http.ts, line 102 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

instead of a TODO, file an issue?

Done. tensorflow/tfjs#290


src/io/browser_http.ts, line 128 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

nit: domain isn't the right name here

localhost.localdomain used to be the default DNS name of the local machine for some linux distros. I think domain is fine.
https://lists.debian.org/debian-devel/2005/10/msg00194.html

But indeed that's not necessary. So I removed it.


src/io/browser_http.ts, line 50 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

should you require the path be of some REST format that take the name of the model?
If not, should the post body contain the name of the model?

The design is to leave metadata information such as name, version etc. of the model up to the client. They can send them in headers.

This keeps the API simple. Users have various constraints and requirements that I don't want to anticipate at this point. Let me know if that sounds good to you.


src/io/browser_http.ts, line 84 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

this uses browser fetch, it won't be available in node, might be more compatible by using node-fetch directly.

Right, this is why we call the method browserHTTPReqest. Unfortunately we can't import node-fetch in tfjs-core.

When we implement HTTP for node, we'll do the following in tfjs-node: use the IOHandler interface from core and import node-fetch to implement an IOHandler. Let me know if that sounds good to you.


src/io/browser_http.ts, line 128 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

I would think a node server example might be more relevant here.

We can add a node.js server later. The nice thing about a Python server is like this is that you can directly import keras and reconstitute a keras Model object, ready fir predict and fit calls on the server side. Let me know if that sounds good to you.


Comments from Reviewable

@pyu10055
Copy link
Collaborator

:lgtm_strong:


Review status: 0 of 6 files reviewed at latest revision, 14 unresolved discussions, some commit checks failed.


src/io/browser_http.ts, line 50 at r2 (raw file):

Previously, caisq (Shanqing Cai) wrote…

The design is to leave metadata information such as name, version etc. of the model up to the client. They can send them in headers.

This keeps the API simple. Users have various constraints and requirements that I don't want to anticipate at this point. Let me know if that sounds good to you.

sounds good, still be good to have the metadata (name, version) in the ModelArtifacts, we might need to add a meta file for storage.


src/io/browser_http.ts, line 84 at r2 (raw file):

Previously, caisq (Shanqing Cai) wrote…

Right, this is why we call the method browserHTTPReqest. Unfortunately we can't import node-fetch in tfjs-core.

When we implement HTTP for node, we'll do the following in tfjs-node: use the IOHandler interface from core and import node-fetch to implement an IOHandler. Let me know if that sounds good to you.

sgtm


Comments from Reviewable

@caisq
Copy link
Collaborator Author

caisq commented May 12, 2018

Review status: 0 of 6 files reviewed at latest revision, 14 unresolved discussions, some commit checks pending.


src/io/browser_http.ts, line 50 at r2 (raw file):

metadata

Agreed that metadata is useful. If you look at the existing saving modalities:

  1. For downloaded files, the model name is the prefix of the file names.
  2. For local storage and indexedDB, the model name is the path key.
  3. For HTTP requests (i.e., this PR), it is up to the client of this API to add a name and other metadata for the model, most likely through headers. If we get users asking for it, we can implement it in the library.

Comments from Reviewable

@caisq caisq merged commit 5980684 into tensorflow:master May 12, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants