float8 training: make the "config from recipe" API polished #1731

Merged
merged 5 commits into from
Feb 19, 2025

@vkuzo vkuzo commented Feb 18, 2025

Summary:

This PR makes the API that takes a recipe name (enum or string) and
returns a Float8LinearConfig instance more polished and ready for
usage in README.md docs and by partner callsites such as torchtitan and
torchtune.
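
To illustrate the "enum or string in, config out" pattern described above, here is a minimal, self-contained sketch. It is not the code from this PR: `SketchConfig` and `config_from_recipe` are hypothetical stand-ins (the real `Float8LinearConfig` has different fields), and only the enum members are taken from the diff under review.

```python
import enum
from dataclasses import dataclass
from typing import Union


class Float8LinearRecipeName(enum.Enum):
    # Member names and values as they appear in the diff under review.
    TENSORWISE = "tensorwise"
    ROWWISE = "rowwise"
    ROWWISE_WITH_GW_HP = "rowwise_with_gw_hp"


@dataclass(frozen=True)
class SketchConfig:
    # Hypothetical stand-in for Float8LinearConfig; it only records the recipe.
    recipe: Float8LinearRecipeName


def config_from_recipe(name: Union[Float8LinearRecipeName, str]) -> SketchConfig:
    """Hypothetical helper: accept an enum member or its string value."""
    if isinstance(name, str):
        valid = [member.value for member in Float8LinearRecipeName]
        if name not in valid:
            raise ValueError(f"unknown recipe {name!r}; expected one of {valid}")
        name = Float8LinearRecipeName(name)
    elif not isinstance(name, Float8LinearRecipeName):
        raise TypeError(f"expected Float8LinearRecipeName or str, got {type(name)}")
    return SketchConfig(recipe=name)
```

Under these assumptions, `config_from_recipe("rowwise")` and `config_from_recipe(Float8LinearRecipeName.ROWWISE)` return equal configs, so callers can pass a plain string from a command-line flag without importing the enum.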

Test Plan:

./test/float8/test_everything.sh

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]

pytorch-bot bot commented Feb 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1731

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo added a commit that referenced this pull request Feb 18, 2025
ghstack-source-id: 4f72eeb19603d6e1203fa9bf6ce8235bf431ecad
ghstack-comment-id: 2667010633
Pull Request resolved: #1731
@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 18, 2025
class Float8LinearRecipeName(enum.Enum):
    TENSORWISE = "tensorwise"
    ROWWISE = "rowwise"
    ROWWISE_WITH_GW_HP = "rowwise_with_gw_hp"

nit: should we clarify somewhere what `gw_hp` means?
It might also make sense to allow a fully written-out version of the name.
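
One way the suggestion above could look is sketched below. This is an illustration only, not something the PR adds: the long-form spelling and the `parse_recipe_name` helper are hypothetical, and the sketch assumes `gw_hp` abbreviates "grad_weight in high precision".

```python
import enum


class Float8LinearRecipeName(enum.Enum):
    TENSORWISE = "tensorwise"
    ROWWISE = "rowwise"
    ROWWISE_WITH_GW_HP = "rowwise_with_gw_hp"


# Hypothetical long-form spelling. Assumption: "gw_hp" abbreviates
# "grad_weight in high precision".
_ALIASES = {
    "rowwise_with_grad_weight_high_precision": Float8LinearRecipeName.ROWWISE_WITH_GW_HP,
}


def parse_recipe_name(text: str) -> Float8LinearRecipeName:
    """Hypothetical helper: resolve a short or fully written-out recipe name."""
    key = text.strip().lower()
    if key in _ALIASES:
        return _ALIASES[key]
    # Enum lookup by value raises ValueError for unknown names.
    return Float8LinearRecipeName(key)
```

Keeping the alias table separate from the enum leaves the canonical string values unchanged, so existing callers that pass `"rowwise_with_gw_hp"` would be unaffected.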

@vkuzo vkuzo added the topic: improvement label Feb 19, 2025
@vkuzo vkuzo changed the base branch from gh/vkuzo/34/head to main February 19, 2025 03:58
@vkuzo vkuzo merged commit c6c388b into main Feb 19, 2025
5 of 6 checks passed
4 participants