feat: Add rate limit handling for GitHub client #5226

Open: lukaspj wants to merge 6 commits into main from feat/github-rate-limit-handling


lukaspj (Contributor) commented Jan 9, 2025

This is a recreation of #4926, which appears to have been abandoned by its author.

What

  • Add a retry mechanism for GitHub client API calls that receive a secondary rate limit response.

Why

  • In large-scale deployments, a single PR can affect hundreds of projects.
  • When attempting to plan/apply for a non-mergeable PR, Atlantis may try to set the PR status hundreds of times in a short period, triggering GitHub's secondary API rate limit.
  • This results in Atlantis appearing to stop mid-operation: some projects are marked as failed, and no comments are posted.
  • This issue can occur in other scenarios where Atlantis is used at scale.

Implementation

  • Utilize the go-github-ratelimit library, as recommended in the official go-github README:

    "You can use go-github-ratelimit to handle secondary rate limit sleep-and-retry for you."

Testing

  • Added a test that verifies an API request completes successfully after the server has responded with secondary rate limit errors multiple times.

References

go-github README
go-github-ratelimit library

@lukaspj lukaspj requested review from a team as code owners January 9, 2025 08:09
@lukaspj lukaspj requested review from jamengual, lukemassa and nitrocode and removed request for a team January 9, 2025 08:09
@dosubot dosubot bot added the feature, go, and provider/github labels Jan 9, 2025
Signed-off-by: Lukas Peter Aldershaab <lukas.aldershaab@lego.com>
Signed-off-by: Lukas Peter Aldershaab <lukas.aldershaab@lego.com>
@lukaspj lukaspj force-pushed the feat/github-rate-limit-handling branch from db27064 to b56cb64 Compare January 9, 2025 08:10
@github-actions github-actions bot added the dependencies label Jan 9, 2025
lukaspj (Contributor, Author) commented Jan 9, 2025

For the record, we have tested this on our setup, which runs hundreds of Atlantis jobs simultaneously, and it greatly improved our experience.

chenrui333 previously approved these changes Jan 9, 2025
@dosubot dosubot bot added the lgtm label Jan 9, 2025
chenrui333 (Member) commented

Sounds like a good idea. @lukaspj, is it configurable?

lukaspj (Contributor, Author) commented Jan 9, 2025

I can certainly make it configurable; the original PR had discussions on this as well. I'm indifferent to configuration options because the current default has been very stable for us, and we had rate limiting problems constantly before.

@@ -124,23 +125,28 @@ func NewGithubClient(hostname string, credentials GithubCredentials, config Gith
return nil, errors.Wrap(err, "error initializing github authentication transport")
}

transportWithRateLimit, err := github_ratelimit.NewRateLimitWaiterClient(transport.Transport, github_ratelimit.WithTotalSleepLimit(time.Minute, nil))
Contributor

Should a callback function be passed so that exceeding the rate limit is reported in the error log?

Contributor Author

I have now added one that logs a simple warning. What do you think of it?

Signed-off-by: Lukas Peter Aldershaab <lukas.aldershaab@lego.com>
@jamengual jamengual added the waiting-on-review and waiting-on-response labels and removed the waiting-on-review label Jan 19, 2025
lukaspj (Contributor, Author) commented Jan 19, 2025

What response is this waiting on? Anything I need to do?

@jamengual jamengual added the waiting-on-review label and removed the waiting-on-response label Jan 19, 2025
lukemassa (Contributor) commented

I'm personally OK with leaving this unconfigurable for now; we can always add it later. One reason is that without some real-world experience it's hard to tell what the default should be (1 minute? 5 minutes? Disabled?). As long as it's announced with the upgrade, I expect the PR as is will be a net positive.
