[integration] Work towards full model2vec integration #3182

Merged: 2 commits, Jan 27, 2025

@tomaarsen (Collaborator) commented Jan 21, 2025

Hello!

Pull Request overview

  • Support loading Model2Vec weight files, as long as the model itself has a modules.json that points to StaticEmbedding.
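
As a rough way to check that precondition on a local copy of a model, the sketch below reads a modules.json file and reports whether its first module is a StaticEmbedding. The helper name `first_module_is_static_embedding` is hypothetical, the code assumes the entry with the lowest `idx` is the first module, and it is not how Sentence Transformers itself performs the check.

```python
import json
import tempfile
from pathlib import Path

STATIC_EMBEDDING_TYPE = "sentence_transformers.models.StaticEmbedding"


def first_module_is_static_embedding(model_dir):
    """Return True if modules.json in model_dir lists StaticEmbedding first.

    Hypothetical helper for illustration; not part of sentence_transformers.
    """
    modules_path = Path(model_dir) / "modules.json"
    if not modules_path.is_file():
        return False
    modules = json.loads(modules_path.read_text(encoding="utf-8"))
    if not modules:
        return False
    # Assumption: the entry with the lowest "idx" is the first module.
    first = min(modules, key=lambda module: module["idx"])
    return first["type"] == STATIC_EMBEDDING_TYPE


# Usage: build a throwaway directory that mimics an updated Model2Vec repo.
with tempfile.TemporaryDirectory() as tmp:
    example = [{"idx": 0, "name": "0", "path": ".", "type": STATIC_EMBEDDING_TYPE}]
    (Path(tmp) / "modules.json").write_text(json.dumps(example), encoding="utf-8")
    print(first_module_is_static_embedding(tmp))
```

A directory without a modules.json makes the helper return False, which corresponds to a Model2Vec model that has not been updated yet.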

Details

This PR works towards making the following code run:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("minishlab/potion-base-8M")
embeddings = model.encode(["It's wonderful weather!", "I love the weather today.", "It's raining cats and dogs."])
print(embeddings.shape)

similarity = model.similarity(embeddings, embeddings)
print(similarity)

(3, 256)
tensor([[1.0000, 0.7467, 0.3813],
        [0.7467, 1.0000, 0.4091],
        [0.3813, 0.4091, 1.0000]])
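
To make the similarity matrix above easier to interpret, here is a minimal pure-Python sketch of a pairwise cosine similarity, assuming that `model.similarity` uses cosine similarity for this model (the 1.0000 diagonal is consistent with that). The function name `cosine_similarity_matrix` and the toy vectors are illustrative only and are not taken from the library.

```python
import math


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity for a list of equal-length, non-zero vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
            for v, norm_v in zip(vectors, norms)
        ]
        for u, norm_u in zip(vectors, norms)
    ]


# Toy 3-dimensional vectors standing in for the real 256-dimensional embeddings.
toy_embeddings = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
for row in cosine_similarity_matrix(toy_embeddings):
    print([round(value, 4) for value in row])
```

As in the output above, the matrix is symmetric and every vector has similarity 1 with itself.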

Beyond the changes in this PR, the Model2Vec models must be updated with a modules.json like:

[
    {
        "idx": 0,
        "name": "0",
        "path": ".",
        "type": "sentence_transformers.models.StaticEmbedding"
    }
]

or, with an additional Normalize module:

[
    {
        "idx": 0,
        "name": "0",
        "path": ".",
        "type": "sentence_transformers.models.StaticEmbedding"
    },
    {
        "idx": 1,
        "name": "1",
        "path": "1_Normalize",
        "type": "sentence_transformers.models.Normalize"
    }
]
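
For anyone preparing a local copy of a Model2Vec model, the two variants above could be generated with a small script such as the following sketch. The helper names `build_modules` and `write_modules_json` are hypothetical, and the script only writes the modules.json file; it assumes nothing else in the model directory needs to change.

```python
import json
import tempfile
from pathlib import Path


def build_modules(normalize=False):
    """Return the modules.json content, optionally with a Normalize module."""
    modules = [
        {
            "idx": 0,
            "name": "0",
            "path": ".",
            "type": "sentence_transformers.models.StaticEmbedding",
        }
    ]
    if normalize:
        modules.append(
            {
                "idx": 1,
                "name": "1",
                "path": "1_Normalize",
                "type": "sentence_transformers.models.Normalize",
            }
        )
    return modules


def write_modules_json(model_dir, normalize=False):
    """Write modules.json into model_dir and return the path of the file."""
    path = Path(model_dir) / "modules.json"
    path.write_text(json.dumps(build_modules(normalize), indent=4) + "\n", encoding="utf-8")
    return path


# Usage: write the two-module variant into a temporary directory and show it.
with tempfile.TemporaryDirectory() as tmp:
    written = write_modules_json(tmp, normalize=True)
    print(written.read_text(encoding="utf-8"))
```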

@Pringled and @stephantul are working towards the changes on the model2vec side.

- Tom Aarsen

@tomaarsen tomaarsen merged commit 3fbc183 into UKPLab:master Jan 27, 2025
9 checks passed