-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NameValueCollection.ToQueryString performance optimisation #4952
Conversation
This commit optimises the internal `ToQueryString` extension method on the `NameValueCollection` type. It avoids some intermediate string allocations by using a zero-alloc buffer to format the final query string. - Adds a test for the extension method to validate existing behaviour is unaffected by the change of implementation. - Adds the `System.Memory` package to support `Span<T>` usage across all target frameworks. This is registered conditionally for targets before NetStandard 2.1. I had some downgrade errors when using the latest package version, so I've aligned with the existing `System.Buffers` version for the time being.
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!, I did suggest one more test case testing the encoding of a 4byte character.
The failing windows integration test is a request timeout on Tests.Indices.IndexManagement.RolloverIndex.RolloverIndexApiTests.
and is unrelated.
@Mpdreamz Modified the code slightly to assume worst case that every byte requires 3 char encoding. This avoids overflowing the buffer, even for strings entirely made up of emojis! Added suggested test + one other for 2 byte character. |
maxLength += 1 + (key.Length + nv[key]?.Length ?? 0) * 3; // '=' char + worst case assume all key/value chars are escaped | ||
var bytes = Encoding.UTF8.GetByteCount(key) + Encoding.UTF8.GetByteCount(nv[key] ?? string.Empty); | ||
var maxEncodedSize = bytes * 3; // worst case, assume all bytes are URL escaped to 3 chars | ||
maxLength += 1 + maxEncodedSize; // '=' + encoded chars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the 1
character padding could potentially 0
when nv[key]
is empty. Since we don't write =
. Not worth the conditional and added complexity though!
The integration test failures are due to changes on the SNAPSHOT version of the server that we have not addressed yet. Opened #4957 to fix those. The auto label failure is due to me creating a new label after this PR was opened and applying it to this PR. Thanks for the updates @stevejgordon! |
This commit optimises the internal
ToQueryString
extension method on theNameValueCollection
type. It avoids some intermediate string allocations by using a zero-alloc buffer to format the final query string.System.Memory
package to supportSpan<T>
usage across all target frameworks. This is registered conditionally for targets before NetStandard 2.1. I had some downgrade errors when using the latest package version, so I've aligned with the existingSystem.Buffers
version for the time being.I've opened this against master although it may be valuable to back port to existing supported versions?
The exact performance improvement varies based on the underlying collection, but in all cases resulted in less allocations and executes equal to or faster than the original code.
.NET Core 2.1 Benchmark Results
For smaller collections and key/value lengths, the allocation improvement can be greater. I've seen up to 77% on some test cases where no URI encoding need occur.
Closes #4951