2.3.0 issues loading model from local dir #2459

Closed
wjdunlop opened this issue Jan 30, 2024 · 4 comments · Fixed by #2460
Comments

@wjdunlop

wjdunlop commented Jan 30, 2024

I'm encountering a problem that may share its origin with #2458, but I'm opening a separate issue since it may not have exactly the same source. Our issue is with loading some old models by model path, where we had moved a .tar.gz of the model.

In my case, the failure ended with `huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/opt/ml/model'. Use repo_type argument if needed.` We were loading the model in question into /opt/ml/model, and loading it by directory name was not a problem prior to 2.3.0.

I'm not sure whether this is an issue with sentence-transformers itself or with something in between the sentence-transformers and huggingface/transformers updates; both were on newer versions in this failing build.

Ultimately, reverting to 2.2.2 fixed the issue.
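For reference, the failing call boils down to roughly the following (a sketch based on our inference.py from the traceback below; /opt/ml/model is where SageMaker extracts the model archive):

```python
from sentence_transformers import SentenceTransformer

# Sketch of the failing load: SageMaker extracts the model .tar.gz into
# /opt/ml/model and we load it by directory path. This worked on 2.2.2
# but raises HFValidationError on 2.3.0.
model = SentenceTransformer("/opt/ml/model", device="cuda")
```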

  File "/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 243, in handle
    self.initialize(context)
  File "/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 83, in initialize
    self.model = self.load(*([self.model_dir] + self.load_extra_arg))
  File "/opt/ml/model/code/inference.py", line 60, in model_fn
    return model(model_dir)
  File "/opt/ml/model/code/inference.py", line 67, in model
    model = SentenceTransformer(model_dir, device='cuda')
  File "/opt/conda/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 194, in __init__
    modules = self._load_sbert_model(
  File "/opt/conda/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 1062, in _load_sbert_model
    module_path = load_dir_path(
  File "/opt/conda/lib/python3.10/site-packages/sentence_transformers/util.py", line 537, in load_dir_path
    repo_path = snapshot_download(**download_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/opt/ml/model'. Use `repo_type` argument if needed.
```

@tomaarsen
Collaborator

tomaarsen commented Jan 30, 2024

Hello!

Thanks for reporting & including the stack trace. The issue seems to originate here:

```python
# Try to download from the remote
try:
    repo_path = snapshot_download(**download_kwargs)
except Exception:
    # Otherwise, try local (i.e. cache) only
    download_kwargs["local_files_only"] = True
    repo_path = snapshot_download(**download_kwargs)
```

Normally, this code should not be reached for local models, as it comes just after a check of whether the model exists locally:
```python
dir_path = os.path.join(model_name_or_path, directory)
if os.path.exists(dir_path):
    return dir_path
```

My theory is that model_name_or_path (e.g. '/opt/ml/model' in your case) does exist, but that the loader is searching for a specific subdirectory inside it which does not exist (e.g. 2_Normalize). Could you check whether your /opt/ml/model/modules.json file contains a relative path in one of the "path" entries that does not exist locally?
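For example, a quick check along these lines (just a sketch, not part of the library; it assumes the model was extracted to /opt/ml/model) should list any module paths referenced by modules.json that are missing on disk:

```python
import json
import os

model_dir = "/opt/ml/model"

# modules.json is a list of module entries, each with a relative "path"
with open(os.path.join(model_dir, "modules.json")) as f:
    modules = json.load(f)

for module in modules:
    path = module.get("path", "")
    # An empty path points at the model root, so only check non-empty ones
    if path and not os.path.exists(os.path.join(model_dir, path)):
        print(f"Missing locally: {path}")
```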

I theorize that this is indeed related to #2458.

  • Tom Aarsen

@wjdunlop
Author

Hi there,

Yes, it does have a specific directory listed in modules.json that doesn't exist: it is looking for 2_Normalize, which is not present after unzipping.

@tomaarsen
Collaborator

tomaarsen commented Jan 30, 2024

Awesome, thanks for the information. I had expected as much. That means my fix from #2460 should also resolve this issue. I'll try to put out a patch release.

If you want, you can try it out for now via

pip install git+/~https://github.com/tomaarsen/sentence-transformers@hotfix/dont_require_normalize_files

Alternatively, you can create an empty 2_Normalize directory within the model directory and the problem should resolve itself.
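For example (a rough sketch, assuming the model was extracted to /opt/ml/model as in your traceback):

```python
import os

# Workaround sketch: create the missing (empty) 2_Normalize directory so the
# local-directory check succeeds instead of falling through to snapshot_download.
os.makedirs("/opt/ml/model/2_Normalize", exist_ok=True)
```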

  • Tom Aarsen

@wjdunlop
Author

Thanks for the quick turnaround on this, Tom; I'll take a look.

Glad I could be of help, and thanks for the support of a very useful toolkit!
