fix(registry/storage/driver/s3-aws): use a consistent multipart chunk size #4424
Conversation
@milosgajdos Would you be able to take a look?
So we're on the same page, the main issue was this: distribution/registry/storage/driver/s3-aws/s3.go, lines 1654 to 1660 at f0bd0f6. The above would produce a huge final chunk (bigger than the specified chunk size) and would cause the upload to fail.
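To illustrate the invariant the fix enforces, here is a standalone sketch (not the driver's actual code): every part is exactly the requested chunk size, and only the final part may be smaller, never larger.

```go
package main

import "fmt"

// chunk splits buf into parts of exactly chunkSize bytes; only the
// final part may be smaller, never larger. The pre-fix behaviour
// instead folded the remainder into the last part, which could then
// exceed chunkSize and be rejected by stores like R2 that expect
// uniform part sizes.
func chunk(buf []byte, chunkSize int) [][]byte {
	var parts [][]byte
	for len(buf) > chunkSize {
		parts = append(parts, buf[:chunkSize])
		buf = buf[chunkSize:]
	}
	return append(parts, buf)
}

func main() {
	for i, p := range chunk(make([]byte, 23), 10) {
		fmt.Printf("part %d: %d bytes\n", i+1, len(p)) // 10, 10, 3
	}
}
```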
We have spun up an instance of Harbor backed by R2 using this PR and have pushed small and large (4.5 GB) images with no issue and no chunksize specified. The success appears to be consistent; the same pushes were failing on an identical setup without this change.
Force-pushed from 2580a22 to cb95fa1.
also @corhere @thaJeztah
I have done some reading, and understand this comment has concerns with simply not writing all the data which has been uploaded. I would need to find a more reliable way to test resumable uploads, but I agree there is likely a discrepancy between what the registry says it has uploaded and what it actually has uploaded. I imagine the best thing to do here would be to ensure the two always agree.
Force-pushed from 13b95d9 to b1e86de.
I've pushed a change to address the above - essentially, the size the writer reports now only reflects data which has actually been committed.
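A minimal sketch of that idea, with hypothetical names (flushPart, uploadPart, and chunkSize are assumptions here, not the PR's exact code): the writer's reported size advances only after a part has actually been uploaded.

```go
// flushPart uploads exactly one full chunk from the buffer. w.size is
// only advanced after the upload succeeds, so the reported size never
// includes bytes that were merely buffered in memory.
func (w *writer) flushPart(ctx context.Context) error {
	data := w.buf.Next(w.chunkSize) // take one chunk out of the buffer
	if err := w.uploadPart(ctx, data); err != nil {
		return err // size unchanged: the data was never committed
	}
	w.size += int64(len(data))
	return nil
}
```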
Force-pushed from 1732bdc to e8b0e55.
I need to give this a proper look, I've only skimmed it. I also need to refresh my memory with the original issue, so might take a bit to review.
```go
// maxChunkSize large which fits in to int. The reason why
// we return int64 is to play nice with Go interfaces where
// the buffer implements io.ReaderFrom interface.
n, _ := w.buf.Write(p)
```
We should probably make a note here saying that the error is always nil when writing to bytes.Buffer.
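For reference, the standard library documents that (*bytes.Buffer).Write always returns a nil error and panics with bytes.ErrTooLarge if the buffer cannot grow, so a note along these lines (suggested wording, not the PR's actual text) would make the discarded error explicit:

```go
// (*bytes.Buffer).Write is documented to always return a nil error;
// if the buffer cannot grow it panics with bytes.ErrTooLarge instead,
// so it is safe to discard err here.
n, _ := w.buf.Write(p)
```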
Force-pushed from 8ea9ba6 to 6993235.
```diff
-	n, err := w.ready.ReadFrom(resp.Body)
-	if err != nil {
+	if _, err := io.Copy(w.buf, resp.Body); err != nil {
```
I had concerns about this - but the above condition guarantees(?) this data will be less than minChunkSize, which is "tiny" (5MB).
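If one wanted to make that guarantee explicit rather than relying on the earlier check, a bounded copy is one option (illustrative only, not part of this PR; it reuses minChunkSize, w.buf, and resp.Body from the surrounding code and needs the fmt and io imports):

```go
// Refuse to buffer more than minChunkSize bytes even if the stored
// pending part is unexpectedly large.
n, err := io.Copy(w.buf, io.LimitReader(resp.Body, minChunkSize+1))
if err != nil {
	return err
}
if n > minChunkSize {
	return fmt.Errorf("pending part is larger than %d bytes", minChunkSize)
}
```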
```diff
@@ -1389,8 +1332,7 @@ func (d *driver) newWriter(ctx context.Context, key, uploadID string, parts []*s
 		uploadID: uploadID,
 		parts:    parts,
 		size:     size,
-		ready:    d.NewBuffer(),
-		pending:  d.NewBuffer(),
+		buf:      d.pool.Get().(*bytes.Buffer),
 	}
 }
```
Question, unrelated to the changes here (sorry if distracting): Why/how could the parts ever become unordered? Shouldn't it be sufficient to sort the parts once when they are first fetched rather than when the upload is restarted or committed?
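For context, S3's CompleteMultipartUpload requires parts in ascending part-number order; sorting them once, right after they are listed, would look roughly like this (a sketch assuming aws-sdk-go's *s3.Part type, where PartNumber is a *int64):

```go
// Sort parts by part number a single time, when they are first fetched,
// instead of re-sorting on every restart or commit.
sort.Slice(parts, func(i, j int) bool {
	return *parts[i].PartNumber < *parts[j].PartNumber
})
```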
I think I'm missing some context for this 🤔
This seems to solve the issue with Cloudflare. @tpoxa and I will test it on our side and see if we find anything.
I will also try and test this myself :)
Hope your tests went okay. If it's helpful, there is a prebuilt image here: /~https://github.com/users/uhthomas/packages/container/package/distribution%2Fregistry
@uhthomas @milosgajdos @corhere Here is our feedback after testing this feature on R2 for our service offering, container-registry.com; my colleague @tpoxa was responsible for the similar PR #3940. We have moved GiBs of data to R2 to test it. Among the tests were our test images vad1mo/1gb-random-file and vad1mo/10gb-random-file, which contain a 1 GiB sized layer.
So from our side the PR is good to go.
On first look the PR looks ok to me. I left some comments. I need to give it another pass at some point.
docker-bake.hcl (outdated):
```diff
-    "linux/arm/v6",
-    "linux/arm/v7",
```
Yeah, no. Let's not do this. We can't remove these. Binaries have been released for these architectures, and so have official images (/~https://github.com/distribution/distribution-library-image/blob/a943e89c3efe06134cd9a4b439203c5341082cbc/Dockerfile) shipping them.
agree
I think the changes here look ok to me. @uhthomas do you wanna address some of my comments? I'd like to do an RC soon and I think it'd be great if we could sneak this change in. Also, this needs a rebase now.
Force-pushed from de0a06c to 9ed1f97.
Commit message:

fix(registry/storage/driver/s3-aws): use a consistent multipart chunk size

Some S3 compatible object storage systems like R2 require that all multipart chunks are the same size. This was mostly true before, except the final chunk was larger than the requested chunk size, which caused uploads to fail.

In addition, the two byte slices have been replaced with a single *bytes.Buffer and the surrounding code simplified significantly.

Fixes: distribution#3873
Signed-off-by: Thomas Way <thomas@6f.io>
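As a rough illustration of the buffer-pool pattern the commit message describes (bufPool and the usage below are assumptions for the sketch, not the driver's exact code):

```go
package main

import (
	"bytes"
	"sync"
)

// A pool of reusable *bytes.Buffer values keeps allocation pressure low
// when many concurrent uploads each need a staging buffer.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func main() {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.WriteString("part data")
	// ... use buf as the upload staging area ...
	buf.Reset() // drop contents before returning the buffer to the pool
	bufPool.Put(buf)
}
```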
LGTM
All looks kosher to me, thanks for the additional testing as well.
Thank you @milosgajdos, @Jamstah and everyone else who took the time to review and test. Really happy to see this finally get over the line!