
Dora_datacollector_updated #2197

Merged
merged 39 commits into from
Nov 4, 2024
Conversation

shirinyamani
Contributor

This PR just updates the data collator for fine-tuning using DoRA.
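For context on what a data collator in a causal-LM fine-tuning setup does, here is a minimal pure-Python sketch: it right-pads each batch to the longest sequence and masks padding positions out of the loss. All names and the padding scheme are illustrative assumptions, not the code from this PR (the actual collator lives in the PEFT DoRA example).

```python
def collate_causal_lm(batch, pad_token_id=0, label_pad=-100):
    """Toy causal-LM data collator (illustrative sketch, not the PR's code).

    Right-pads every sequence of token ids to the longest one in the
    batch, builds an attention mask, and copies the ids into labels
    with padding replaced by `label_pad` so it is ignored by the loss.
    """
    max_len = max(len(ids) for ids in batch)
    input_ids, attention_mask, labels = [], [], []
    for ids in batch:
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * pad)
        attention_mask.append([1] * len(ids) + [0] * pad)
        labels.append(ids + [label_pad] * pad)  # loss is masked on padding
    return {"input_ids": input_ids,
            "attention_mask": attention_mask,
            "labels": labels}
```

In practice a library collator would also return tensors rather than lists, but the padding and label-masking logic is the core idea.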

shirinyamani and others added 30 commits June 19, 2024 23:15
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@BenjaminBossan left a comment

Thanks for fixing this.

I think when creating the PR, something went wrong, as it shows 39 commits, even though there is only one change. But it's no big deal, I can still merge.

@BenjaminBossan BenjaminBossan merged commit 4e57aa5 into huggingface:main Nov 4, 2024
14 checks passed
@shirinyamani
Contributor Author

Hi Benjamin @BenjaminBossan
Thanks for all the good work you do. I just added the implementation of the SnapKV Cache paper to the cache_utils.py file. To reflect the SnapKV approach, I added the implementation on llama_modeling under llama_snapkv.py so that users can see what changes have to be applied to flash_attention2 to reflect SnapKV. However, I'm not entirely sure that this is the best location for the implementation. I'm confident that the main SnapKV logic belongs in cache_utils.py, but I would really appreciate your insights on whether the llama_snapkv.py example is well placed or whether it should be integrated differently.
Much appreciated!
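For readers unfamiliar with the method being discussed: SnapKV compresses the KV cache by scoring each cached key position by the attention mass it receives from a recent "observation window" of queries, then keeping only the top-scoring positions. The sketch below is a toy, dependency-free illustration of that selection idea; all names are hypothetical, and this is not the implementation from cache_utils.py or llama_snapkv.py.

```python
import math

def snapkv_select(keys, queries_window, keep):
    """Toy SnapKV-style selection (illustrative sketch, not the PR's code).

    Scores each cached key position by the softmax attention mass it
    receives from the queries in a recent observation window, then
    returns the indices of the `keep` highest-scoring positions in
    their original order. (The real method also pools scores and
    always retains the window tokens themselves; omitted for brevity.)
    """
    d = len(keys[0])
    scores = [0.0] * len(keys)
    for q in queries_window:
        # scaled dot-product attention logits of this query over all keys
        logits = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        for i, e in enumerate(exps):
            scores[i] += e / z  # accumulate attention mass per position
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:keep]
    return sorted(top)

# Key at index 1 aligns with both window queries, so it is retained.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
window = [[0.0, 2.0], [0.0, 3.0]]
print(snapkv_select(keys, window, keep=2))
```

In an actual cache implementation this selection would be applied per attention head to the cached key/value tensors before continuing generation.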

@BenjaminBossan
Member

@shirinyamani Thanks for working on this. I'm not a transformers maintainer, so I can't really tell you where best to implement this feature; it's best if you reach out to the transformers folks. What you could also do is create a draft PR with your implementation the way you suggested — I'm sure they'll let you know if you have to change something.
