Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NameValueCollection.ToQueryString performance optimisation #4952

Merged
merged 2 commits into from
Aug 13, 2020
Merged

NameValueCollection.ToQueryString performance optimisation #4952

merged 2 commits into from
Aug 13, 2020

Conversation

stevejgordon
Copy link
Contributor

@stevejgordon stevejgordon commented Aug 10, 2020

This commit optimises the internal ToQueryString extension method on the NameValueCollection type. It avoids some intermediate string allocations by using a zero-alloc buffer to format the final query string.

  • Adds a test for the extension method to validate existing behaviour is unaffected by the change of implementation.
  • Adds the System.Memory package to support Span<T> usage across all target frameworks. This is registered conditionally for targets before NetStandard 2.1. I had some downgrade errors when using the latest package version, so I've aligned with the existing System.Buffers version for the time being.

I've opened this against master although it may be valuable to back port to existing supported versions?

The exact performance improvement varies based on the underlying collection, but in all cases resulted in less allocations and executes equal to or faster than the original code.

.NET Core 2.1 Benchmark Results

private NameValueCollection _nvc = new NameValueCollection
{
    { "q", "title:\"The Right Way\" AND mod_date:[20020101 TO 20030101]" },
    { "from", "10000" },
    { "request_cache", bool.TrueString },
    { "size", "100" }
};
|   Method |     Mean |     Error |    StdDev | Ratio | RatioSD |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|--------- |---------:|----------:|----------:|------:|--------:|-------:|------:|------:|----------:|
| Original | 4.105 us | 0.0901 us | 0.2614 us |  1.00 |    0.00 | 0.3052 |     - |     - |    1288 B |
|      New | 3.652 us | 0.0730 us | 0.1898 us |  0.89 |    0.07 | 0.1755 |     - |     - |     736 B |
  • Allocations reduced by 42.9%
  • 11% faster

For smaller collections and key/value lengths, the allocation improvement can be greater. I've seen up to 77% on some test cases where no URI encoding need occur.

Closes #4951

This commit optimises the internal `ToQueryString` extension method on the `NameValueCollection` type. It avoids some intermediate string allocations by using a zero-alloc buffer to format the final query string.

- Adds a test for the extension method to validate existing behaviour is unaffected by the change of implementation.
- Adds the `System.Memory` package to support `Span<T>` usage across all target frameworks. This is registered conditionally for targets before NetStandard 2.1. I had some downgrade errors when using the latest package version, so I've aligned with the existing `System.Buffers` version for the time being.
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

Copy link
Member

@Mpdreamz Mpdreamz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!, I did suggest one more test case testing the encoding of a 4byte character.

The failing windows integration test is a request timeout on Tests.Indices.IndexManagement.RolloverIndex.RolloverIndexApiTests. and is unrelated.

@stevejgordon
Copy link
Contributor Author

@Mpdreamz Modified the code slightly to assume worst case that every byte requires 3 char encoding. This avoids overflowing the buffer, even for strings entirely made up of emojis! Added suggested test + one other for 2 byte character.

maxLength += 1 + (key.Length + nv[key]?.Length ?? 0) * 3; // '=' char + worst case assume all key/value chars are escaped
var bytes = Encoding.UTF8.GetByteCount(key) + Encoding.UTF8.GetByteCount(nv[key] ?? string.Empty);
var maxEncodedSize = bytes * 3; // worst case, assume all bytes are URL escaped to 3 chars
maxLength += 1 + maxEncodedSize; // '=' + encoded chars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the 1 character padding could potentially 0 when nv[key] is empty. Since we don't write =. Not worth the conditional and added complexity though!

@Mpdreamz
Copy link
Member

The integration test failures are due to changes on the SNAPSHOT version of the server that we have not addressed yet. Opened #4957 to fix those.

The auto label failure is due to me creating a new label after this PR was opened and applying it to this PR.

Thanks for the updates @stevejgordon!

@Mpdreamz Mpdreamz merged commit dff2c6a into elastic:master Aug 13, 2020
@stevejgordon stevejgordon deleted the #4951 branch June 16, 2021 12:10
@stevejgordon stevejgordon restored the #4951 branch June 16, 2021 12:10
@stevejgordon stevejgordon deleted the #4951 branch June 16, 2021 12:10
stevejgordon added a commit that referenced this pull request Jun 16, 2021
…tion (#5731)

* NameValueCollection.ToQueryString performance optimisation (#4952)

* Update package locks

* Update license header and remove BOM
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NameValueCollectionExtensions Performance Optimisation
3 participants