
Implement remaining tasks in InferenceClient #1539

Closed · 12 tasks done · Tracked by #1488
Wauplin opened this issue Jul 3, 2023 · 10 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

Wauplin (Contributor) commented Jul 3, 2023

InferenceClient is a new user-friendly client for running inference on Hugging Face products. It can seamlessly interact with both the Inference API, the free service for quickly discovering and evaluating hosted models, and Inference Endpoints, the paid product for production-ready, cloud-based inference.

The InferenceClient (see #1510) offers a simple API with 1 method per task. It handles input/output serialization out of the box to make it easy to use. Similarly, AsyncInferenceClient provides the same interface but for async calls (see #1524). In the initial implementation, only a subset of tasks have been implemented. It is now time to implement the rest!
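
For illustration, here is a minimal sketch of how that per-task API looks in practice (the example input and leaving the model unset are assumptions for the example, not part of this issue):

```python
from huggingface_hub import InferenceClient

# One method per task; input/output serialization is handled internally.
client = InferenceClient()  # optionally: InferenceClient(model=..., token=...)
summary = client.summarization(
    "The tower is 324 metres tall, about the same height as an 81-storey building."
)

# AsyncInferenceClient exposes the same methods, but awaitable:
#   async_client = AsyncInferenceClient()
#   summary = await async_client.summarization("...")
```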

This issue is meant to centralize the development of the other tasks. The idea is to create one PR for each new task, depending on the demand for each task. Here is a good example from @dulayjm, who implemented the zero_shot_image_classification task.

Requirements for each task:

  • Implement the method in _client.py
    • inputs and outputs must be type-annotated to help with documentation
    • usually comes down to a self.post call with the correct parameters + deserialization (see the sketch below this list)
  • Add a docstring including the sections: description, args, returns, raises, and example.
  • Add a test in ./tests/test_inference_client.py
  • Generate a test record with pytest tests/test_inference_client.py -k <test_name> --vcr-record=all => generates a YAML file that mocks the HTTP calls so CI tests don't hit the production Inference API (see comment)
  • Update the table in inference.md
  • Run make style => formats the code correctly AND generates the asyncio version of the code out of the box
  • Run make quality => makes sure the code is correctly formatted/typed => don't hesitate to ask for help if needed
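
For newcomers, here is a rough sketch of the shape such a method typically takes inside the existing InferenceClient class (the task chosen, the payload layout, and the deserialization step are illustrative assumptions, not the actual implementation):

```python
import json
from typing import Any, Dict, List, Optional


class InferenceClient:
    # ... existing methods ...

    def zero_shot_classification(
        self, text: str, labels: List[str], *, model: Optional[str] = None
    ) -> Dict[str, Any]:
        """Classify `text` against candidate `labels` (docstring sections omitted here)."""
        # Serialize the inputs, delegate the HTTP call to self.post, then deserialize.
        response = self.post(
            json={"inputs": text, "parameters": {"candidate_labels": labels}},
            model=model,
            task="zero-shot-classification",
        )
        return json.loads(response)
```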

Remaining tasks:

Wauplin added the good first issue (Good for newcomers) and enhancement (New feature or request) labels on Jul 3, 2023
dulayjm (Contributor) commented Jul 3, 2023

Well-written issue! I'll take a look and see if I can potentially implement a few more.

martinbrose (Contributor) commented:

Hi @dulayjm / @Wauplin,

I ended up going through the list and doing all of the remaining tasks as little challenges.

Thanks @dulayjm for providing a good template with your object detection PR. I hope you weren't working on any of the remaining ones. I'm sorry if you were...

dulayjm (Contributor) commented Aug 24, 2023

Thank you for writing them!!

martinbrose (Contributor) commented:

@Wauplin, many thanks for all the PR reviews and the review points provided.
It's been a great learning experience.

You've been on 🔥 after your holiday!

Wauplin (Contributor, Author) commented Sep 7, 2023

Marathon is now over!🏃😴

Thanks @martinbrose and @dulayjm for the massive community work on this issue 🤗 I never thought it would be closed that quickly. We now support all the main tasks available on the Hub, which is a very nice improvement 🔥 🚀

Wauplin closed this as completed on Sep 7, 2023
martinbrose (Contributor) commented:

@Wauplin, there is one more thing I would like to discuss:

When looking at tasks that don't have inference models readily available, where a model first needs to start up (like tabular classification), I ran into some problems in my Jupyter Notebook playground.

I believe the implementation is sending API requests in quick succession while the model is starting up, and the user gets rate-limited super quickly.

Should that be mentioned in the docstring, with advice added on how to avoid this?

Wauplin (Contributor, Author) commented Sep 7, 2023

Hmm, that's a good question. Have you been rate-limited yourself? In general, rate limits happen quickly if the user is not logged in (e.g. not registered with a token). What we could do is catch HTTP 429 errors here and here and, if:

  1. an HTTP 429 happens
  2. and there is no token in the header
  3. and the InferenceAPI is used (e.g. the URL starts with "https://api-inference.huggingface.co")

=> then we append a message to the raised error saying "hey, you should use a token".

This way it would be seen much more easily by the user (than completing the docstring). WDYT?
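
For illustration, a quick sketch of the proposed behavior (this is not the actual huggingface_hub code; the helper name and the plain requests-based error handling are assumptions):

```python
import requests

INFERENCE_API_URL = "https://api-inference.huggingface.co"


def raise_with_token_hint(response: requests.Response, request_headers: dict) -> None:
    """Re-raise HTTP errors, appending a hint when an anonymous Inference API call is rate-limited."""
    try:
        response.raise_for_status()
    except requests.HTTPError as error:
        if (
            response.status_code == 429  # 1. an HTTP 429 happened
            and "authorization" not in {k.lower() for k in request_headers}  # 2. no token in the header
            and response.url.startswith(INFERENCE_API_URL)  # 3. the Inference API is used
        ):
            # Append a hint to the raised error instead of replacing it.
            error.args = (str(error) + " (hint: pass a token to avoid rate limits)",)
        raise
```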

Also, maybe we should add a warning on tabular-classification/tabular-regression that they are not working on the InferenceAPI at the moment (which is not so problematic given how little usage we have for tabular data on the Hub).

martinbrose (Contributor) commented:

Yeah, I think some sort of error-catching and warning might be needed. I'll have a look at the above links tonight.

I was logged in with a token and even signed up for Pro for a few days. In all cases, I got rate-limited very quickly.
I haven't had time to understand exactly why it's happening. I did add the additional parameter "wait_for_model": True, but that didn't seem to help much either.
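
For context, "wait_for_model" is a request option of the Inference API; a minimal sketch of passing it in a raw request is shown below (URL, model ID, and token are placeholders, and this may differ from the exact notebook setup):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/<model_id>"  # placeholder model
headers = {"Authorization": "Bearer hf_xxx"}  # placeholder token

payload = {
    "inputs": "...",  # task-specific inputs, elided here
    "options": {"wait_for_model": True},  # ask the API to wait for the model to load
}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.status_code)
```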

Wauplin (Contributor, Author) commented Sep 7, 2023

Hmm, interesting feedback, I didn't know about that. Let's open a new issue then to address it specifically. Would you mind doing it and reporting your use case + error?

From what I understand, in the worst-case scenario InferenceClient should make 1 call per second, which shouldn't lead to rate limits too quickly. I saw the "wait_for_model": True option but, to be honest, this is a bit outdated now that we've set up a retry mechanism in InferenceClient directly (or at least I thought so). I'll investigate further when I have time.
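
Not the library's actual code, but a minimal sketch of the retry behavior being described, at most roughly one call per second while the model is still loading (the 503 status check and the timeout value are assumptions):

```python
import time

import requests


def post_with_retry(url: str, payload: dict, headers: dict, timeout: float = 60.0) -> requests.Response:
    """POST to an inference endpoint, retrying about once per second while the model loads."""
    deadline = time.time() + timeout
    while True:
        response = requests.post(url, headers=headers, json=payload)
        # 503 is used here as the "model is still loading" signal.
        if response.status_code != 503 or time.time() >= deadline:
            return response
        time.sleep(1)  # wait before retrying => at most ~1 call per second
```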

martinbrose (Contributor) commented:

Sure, no problem.
Will attend to this over the weekend.
