FEAT: Support torchao #2062
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
With huggingface/transformers#33361 merged (which marks torchao as trainable), the GPU tests on this PR should pass once the next transformers version (>4.44.2) is released (I tested locally). This PR should not be merged before that.
Thanks for making torchao compatible, @BenjaminBossan! LGTM! Just a few nits.
cc @msaroufim
src/peft/tuners/lora/torchao.py (Outdated)
    # TODO
    rep = super().__repr__()
    return rep.replace("lora.Linear", f"lora.{self.__class__.__name__}")
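The diff above reuses the parent class's repr and swaps in the subclass name. A minimal standalone sketch of that pattern (the class bodies and the repr string here are illustrative, not PEFT's actual ones):

```python
# Sketch of the __repr__ substitution pattern: the torchao subclass reuses
# the parent lora.Linear repr but replaces the class name with its own.
class Linear:
    def __repr__(self) -> str:
        # Stand-in for the real lora.Linear repr.
        return "lora.Linear(in_features=16, out_features=16)"


class TorchaoLoraLinear(Linear):
    def __repr__(self) -> str:
        rep = super().__repr__()
        return rep.replace("lora.Linear", f"lora.{self.__class__.__name__}")


print(repr(TorchaoLoraLinear()))
# prints lora.TorchaoLoraLinear(in_features=16, out_features=16)
```

This way the subclass inherits any future changes to the parent repr for free and only patches the name.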
TODO left
    raise ValueError(f"{type(self).__name__} only supports int8 weights for now.")

    def merge(self, safe_merge: bool = False, adapter_names: Optional[list[str]] = None) -> None:
        from torchao import quantize_
quantize_ is only available from torchao 0.4.0. Maybe we should modify is_torchao_available a bit to take that into account?
- min torchao version
- remove TODO
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Supports torch AO quantization. Currently supported:
- int8_weight_only
- int8_dynamic_activation_int8_weight

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Add support for torchao.

The current status is:
- int8_weight_only: works fully
- int8_dynamic_activation_int8_weight: only works partly (as dequantize is not supported, merging and DoRA won't work)
- int4_weight_only: not supported, as some ops for the forward call are missing
- nf4: not supported on the transformers side
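The dequantize limitation above is why merging breaks: merging a LoRA adapter adds a full-precision delta into the base weight, so a quantized weight must first be dequantized, updated, then re-quantized. A pure-Python toy sketch of that round trip for int8 weight-only quantization (illustrative only; torchao's actual kernels and layouts differ):

```python
# Toy int8 weight-only quantization: store round(w / scale) as int8 plus a
# per-tensor scale. Merging a LoRA delta requires the dequantize step.


def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]


def merge_lora(q: list[int], scale: float, delta: list[float]) -> tuple[list[int], float]:
    # dequantize -> add the LoRA delta (B @ A in the real case) -> re-quantize
    merged = [w + d for w, d in zip(dequantize_int8(q, scale), delta)]
    return quantize_int8(merged)


q, s = quantize_int8([0.5, -1.0, 0.25])
q_merged, s_merged = merge_lora(q, s, [0.1, 0.0, -0.05])
print(dequantize_int8(q_merged, s_merged))
```

If a quantization scheme exposes no dequantize op (the int8_dynamic_activation_int8_weight case noted above), the first step of merge_lora is impossible, which is exactly why merging and DoRA don't work there.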