New model: bigcode/santacoder #445

Merged on Nov 8, 2023 (10 commits)

ojh31
Contributor

@ojh31 ojh31 commented Oct 31, 2023

Description

Added support for the new model https://huggingface.co/bigcode/santacoder. I think it is too large to include in the unit tests, but I added a notebook showing that its outputs agree with the Hugging Face (HF) implementation.
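For readers without the notebook to hand, here is a minimal sketch of what such an agreement check might look like. It is not the notebook's actual code: the helper names `max_abs_diff` and `compare_with_hf`, the prompt, and the tolerance are all hypothetical, and comparing log-probabilities rather than raw logits is my own choice, made so the result does not depend on TransformerLens's default weight processing.

```python
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest elementwise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_with_hf(prompt: str = "def hello_world():", atol: float = 1e-3) -> bool:
    """Hypothetical check: do TransformerLens and HF agree on one prompt?

    Downloads the full checkpoint, so this is not suitable for unit tests.
    The prompt and tolerance are illustrative assumptions.
    """
    # Heavy imports stay local so max_abs_diff is usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from transformer_lens import HookedTransformer

    name = "bigcode/santacoder"
    tokenizer = AutoTokenizer.from_pretrained(name)
    # The HF checkpoint ships custom modelling code, hence trust_remote_code=True.
    hf_model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)
    tl_model = HookedTransformer.from_pretrained(name, device="cpu")

    tokens = tokenizer(prompt, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        hf_logits = hf_model(tokens).logits
        tl_logits = tl_model(tokens)  # returns logits by default

    # Log-probabilities are unchanged by a constant shift of the logits,
    # so they are a safer quantity to compare than the logits themselves.
    hf_logprobs = torch.log_softmax(hf_logits.float(), dim=-1)
    tl_logprobs = torch.log_softmax(tl_logits.float(), dim=-1)

    diff = max_abs_diff(hf_logprobs.flatten().tolist(), tl_logprobs.flatten().tolist())
    return diff <= atol
```

Calling `compare_with_hf()` should return `True` if the two implementations match to within the chosen tolerance; the appropriate tolerance depends on dtype and device, so treat the default as a placeholder.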

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@jbloomAus
Collaborator

Looks good, thanks!

@jbloomAus jbloomAus merged commit b46ff94 into TransformerLensOrg:main Nov 8, 2023
8 checks passed
alan-cooney pushed a commit to SeuperHakkerJa/TransformerLens that referenced this pull request Nov 10, 2023
* Added santacoder to aliases

* Removed reference to multiquery parameter

* Added santacoder to tests

* Asserted that trust_remote_code=true for santacoder

* Added demo notebook for santacoder

* Removed print statements and forcibly set trust_remote_code=True

* Changed spacing and indentation for black

* Removed model type hint in convert weights method

* Removed santacoder test due to memory issues

* Added back in print statement for loading pretrained model