Implement remaining tasks in InferenceClient
#1539
Comments
Well-written issue! I'll take a look and see if I can potentially implement a few of the others.
Thank you for writing them!!
@Wauplin, many thanks for all the PR reviews and the review points provided. You've been on 🔥 after your holiday!
The marathon is now over! 🏃😴 Thanks @martinbrose @dulayjm for the massive community work on this issue 🤗 I never thought it would be closed that quickly. We now support all the main tasks available on the Hub, which is a very nice improvement 🔥 🚀
@Wauplin, there is one more thing I would like to discuss: when looking at the tasks that don't have inference models readily available, so that a model first needs to start up (like tabular classification), I ran into some problems in my Jupyter Notebook playground. I believe the implementation sends API requests in quick succession while the model is starting up, and the user gets rate-limited super quickly. Should that be mentioned in the docstring, with advice added on how to avoid this?
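(A minimal client-side sketch of the workaround being discussed here, assuming `huggingface_hub`'s `InferenceClient` and `HfHubHTTPError`; the `call_with_backoff` helper and the usage line are hypothetical, not part of the library:)

```python
import time

from huggingface_hub import InferenceClient
from huggingface_hub.utils import HfHubHTTPError

# Passing a token matters: anonymous calls hit the rate limit much sooner.
client = InferenceClient(token="hf_...")

def call_with_backoff(fn, *args, max_retries=5, **kwargs):
    """Retry an inference call with exponential backoff instead of rapid-fire requests."""
    delay = 2.0
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except HfHubHTTPError as error:
            status = error.response.status_code if error.response is not None else None
            # 429 = rate-limited, 503 = model still loading; re-raise anything else
            if status not in (429, 503) or attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay *= 2

# Hypothetical usage against a cold tabular model:
# call_with_backoff(client.tabular_classification, table={...}, model="<model-id>")
```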
Hmm, that's a good question. Have you been rate-limited yourself? In general, rate limits happen quickly if the user is not logged in (e.g. not registered with a token). What we could do is catch HTTP 429 errors here and here and, if:
=> then append a message to the raised error saying "hey, you should use a token". This way it would be seen much more easily by the user (compared to completing the docstring). WDYT? Also, maybe we should add a warning on tabular-classification/tabular-regression that they are not working on the InferenceAPI at the moment (which is not so problematic given how little usage we have for tabular data on the Hub).
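(As a rough standalone sketch of that idea — this is not the actual `huggingface_hub` code; `post_with_hint` and the message wording are made up for illustration:)

```python
from typing import Optional

import requests

def post_with_hint(url: str, payload: dict, token: Optional[str] = None) -> requests.Response:
    """Sketch: raise on HTTP errors, but enrich 429s when no token was provided."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    response = requests.post(url, json=payload, headers=headers)
    try:
        response.raise_for_status()
    except requests.HTTPError as error:
        if response.status_code == 429 and token is None:
            # Append a hint so unauthenticated users immediately see how to raise their limit
            raise requests.HTTPError(
                f"{error}\nHint: you are not authenticated. Pass a Hugging Face token "
                "to benefit from higher rate limits.",
                response=response,
            ) from error
        raise
    return response
```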
Yeah, I think some sort of error-catching and warning might be needed. I'll have a look at the above links tonight. I was logged in with a token and even signed up for Pro for a few days. In all cases, I got rate-limited very quickly.
Hmm, interesting feedback, I didn't know about that. Let's open a new issue then to address it specifically. Would you mind doing it and reporting your use case + error? From what I understand, in the worst-case scenario
Sure, no problem.
`InferenceClient` is a new user-friendly client to deal with inference on HF products. It can seamlessly interact with both the Inference API, the free service to quickly discover and evaluate hosted models, and Inference Endpoints, the paid product for production-ready cloud-based inference.

The `InferenceClient` (see #1510) offers a simple API with 1 method per task. It handles input/output serialization out of the box to make it easy to use. Similarly, `AsyncInferenceClient` provides the same interface but for async calls (see #1524). In the initial implementation, only a subset of tasks has been implemented. It is now time to implement the rest!

This issue is meant to centralize the development of the other tasks. The idea is to create 1 PR for each new task, depending on the demand for each task. Here is a good example from @dulayjm, who implemented the `zero_shot_image_classification` task.
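For illustration, here is a rough sketch of what such a task method could look like, following the `self.post` pattern described in the requirements below. The payload shape, the exact `post` signature, and the `tabular_classification` name and return type are assumptions, not the actual implementation:

```python
import json
from typing import Any, Dict, List, Optional

class InferenceClient:
    ...  # existing helpers, including the generic `post` method

    def tabular_classification(self, table: Dict[str, Any], *, model: Optional[str] = None) -> List[Any]:
        """Sketch of a task method: one `self.post` call with the right payload, then deserialization."""
        # Serialize the input and delegate to the generic HTTP helper
        response = self.post(json={"table": table}, model=model, task="tabular-classification")
        # Deserialize the raw body into Python objects (assuming `post` returns raw JSON bytes)
        return json.loads(response)
```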
Requirements for each task:

- Implement the method in `InferenceClient`: basically a `self.post` call with the correct parameters + deserialization (as in the sketch above).
- Run `pytest tests/test_inference_client.py -k <test_name> --vcr-record=all` => generates a YAML file to mock the HTTP calls, to avoid using the InferenceAPI prod for CI tests (see comment).
- Run `make style` => will format the code correctly AND generate the `asyncio` version of the code out of the box.
- Run `make quality` => makes sure the code is correctly formatted/typed => don't hesitate to ask for help if needed.

Remaining tasks: