Implement remaining tasks in InferenceClient
#1539
Comments
Well-written issue! I'll take a look and see if I can potentially implement a few of the others.
Thank you for writing them!!
@Wauplin, many thanks for all the PR reviews and the review points provided. You've been on 🔥 after your holiday!
The marathon is now over! 🏃😴 Thanks @martinbrose @dulayjm for the massive community work on this issue 🤗 I never thought it would be closed that quickly. We now support all the main tasks available on the Hub, which is a very nice improvement 🔥 🚀
@Wauplin, there is one more thing I would like to discuss: when looking at the tasks that don't have inference models readily available, so that a model first needs to start up (like tabular classification), I ran into some problems in my Jupyter Notebook playground. I believe the implementation sends API requests in quick succession while the model is starting up, and the user gets rate-limited super quickly. Should that be mentioned in the docstring, with advice added on how to avoid this?
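(A minimal client-side sketch of the workaround being discussed here, assuming `huggingface_hub`'s `InferenceClient` and `HfHubHTTPError`; the `call_with_backoff` helper and the usage line are hypothetical, not part of the library:)

```python
import time

from huggingface_hub import InferenceClient
from huggingface_hub.utils import HfHubHTTPError

# Passing a token matters: anonymous calls hit the rate limit much sooner.
client = InferenceClient(token="hf_...")

def call_with_backoff(fn, *args, max_retries=5, **kwargs):
    """Retry an inference call with exponential backoff instead of rapid-fire requests."""
    delay = 2.0
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except HfHubHTTPError as error:
            status = error.response.status_code if error.response is not None else None
            # 429 = rate-limited, 503 = model still loading; re-raise anything else
            if status not in (429, 503) or attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay *= 2

# Hypothetical usage against a cold tabular model:
# call_with_backoff(client.tabular_classification, table={...}, model="<model-id>")
```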
Hmm, that's a good question. Have you been rate-limited yourself? In general, rate limits happen quickly if the user is not logged in (e.g. not registered with a token). What we could do is catch HTTP 429 errors here and here and, if:
=> then append a message to the raised error saying "hey, you should use a token". This way it would be seen much more easily by the user (compared to completing the docstring). WDYT? Also, maybe we should add a warning on tabular-classification/tabular-regression that they are not working on the InferenceAPI at the moment (which is not so problematic given how little usage we have for tabular data on the Hub).
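(As a rough standalone sketch of that idea — this is not the actual `huggingface_hub` code; `post_with_hint` and the message wording are made up for illustration:)

```python
from typing import Optional

import requests

def post_with_hint(url: str, payload: dict, token: Optional[str] = None) -> requests.Response:
    """Sketch: raise on HTTP errors, but enrich 429s when no token was provided."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    response = requests.post(url, json=payload, headers=headers)
    try:
        response.raise_for_status()
    except requests.HTTPError as error:
        if response.status_code == 429 and token is None:
            # Append a hint so unauthenticated users immediately see how to raise their limit
            raise requests.HTTPError(
                f"{error}\nHint: you are not authenticated. Pass a Hugging Face token "
                "to benefit from higher rate limits.",
                response=response,
            ) from error
        raise
    return response
```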
Yeah, I think some sort of error-catching and warning might be needed. I'll have a look at the above links tonight. I was logged in with a token and even signed up for Pro for a few days. In all cases, I got rate-limited very quickly.
Hmm, interesting feedback, I didn't know about that. Let's open a new issue then to address it specifically. Would you mind doing it and reporting your use case + error? From what I understand, in the worst-case scenario
Sure, no problem.
`InferenceClient` is a new user-friendly client to deal with inference on HF products. It can seamlessly interact with both the Inference API, the free service to quickly discover and evaluate hosted models, and Inference Endpoints, the paid product for production-ready cloud-based inference.

The `InferenceClient` (see #1510) offers a simple API with 1 method per task. It handles input/output serialization out of the box to make it easy to use. Similarly, `AsyncInferenceClient` provides the same interface but for async calls (see #1524). In the initial implementation, only a subset of tasks has been implemented. It is now time to implement the rest!

This issue is meant to centralize the development of the other tasks. The idea is to create 1 PR for each new task, depending on the demand for each task. Here is a good example from @dulayjm, who implemented the `zero_shot_image_classification` task.
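For illustration, here is a rough sketch of what such a task method could look like, following the `self.post` pattern described in the requirements below. The payload shape, the exact `post` signature, and the `tabular_classification` name and return type are assumptions, not the actual implementation:

```python
import json
from typing import Any, Dict, List, Optional

class InferenceClient:
    ...  # existing helpers, including the generic `post` method

    def tabular_classification(self, table: Dict[str, Any], *, model: Optional[str] = None) -> List[Any]:
        """Sketch of a task method: one `self.post` call with the right payload, then deserialization."""
        # Serialize the input and delegate to the generic HTTP helper
        response = self.post(json={"table": table}, model=model, task="tabular-classification")
        # Deserialize the raw body into Python objects (assuming `post` returns raw JSON bytes)
        return json.loads(response)
```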
Requirements for each task:

- Implement the method in `InferenceClient`: basically a `self.post` call with the correct parameters + deserialization (as in the sketch above).
- Run `pytest tests/test_inference_client.py -k <test_name> --vcr-record=all` => generates a YAML file to mock the HTTP calls, to avoid using the InferenceAPI prod for CI tests (see comment).
- Run `make style` => will format the code correctly AND generate the `asyncio` version of the code out of the box.
- Run `make quality` => makes sure the code is correctly formatted/typed => don't hesitate to ask for help if needed.

Remaining tasks: