Change SipHash implementation to an optimized assembly version #35735

Closed
brson opened this issue Aug 16, 2016 · 13 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@brson
Contributor

brson commented Aug 16, 2016

It's crucial that our secure hash maps are as fast as we possibly can make them, and there's more we can do yet. This is worth dropping down to assembly or even C for.

The best candidate is https://github.com/google/highwayhash. Unfortunately, it is apache2 only. We would need to get them to dual-license, but let's not bug them until we're sure we have a patch that's acceptable on our end.

This implementation requires AVX2, which we cannot count on any of our shipping targets to support. But it seems relatively simple to me to do CPU feature detection dynamically in some strategic place, then drop to the fast version.

So the steps here:

  • Out of tree, create a SipHash13/24 implementation in Rust, backed by the highwayhash implementation
  • Benchmark it to prove it's fast enough to be worth the effort
  • Add the highwayhash siphash implementation to src/rt, and add it to the build system(s)
  • Add the optimized SipHash13/24 implementation to std, alongside the existing one
  • Create another SipHash13/24 definition that delegates to one of the others. During construction detect AVX2 and set a flag. Dispatch based on flag.

An obvious concern is the overhead of that flag, and of the FFI crossing.
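A minimal sketch of the "detect at construction, set a flag, dispatch on the flag" step, to make the overhead in question concrete. The names `Backend` and `DispatchHasher` are hypothetical, and `DefaultHasher` is only a stand-in for both code paths; a real version would call the FFI-backed AVX2 routine in one arm. `std::is_x86_feature_detected!` is a standard library macro that was stabilized after this issue was opened.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Which implementation a hasher instance will use (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Portable,
    Avx2,
}

/// Runtime CPU feature detection on x86 targets.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_backend() -> Backend {
    if std::is_x86_feature_detected!("avx2") {
        Backend::Avx2
    } else {
        Backend::Portable
    }
}

/// Every other target always takes the portable path.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect_backend() -> Backend {
    Backend::Portable
}

pub struct DispatchHasher {
    backend: Backend,      // the flag set once during construction
    inner: DefaultHasher,  // stand-in state; NOT the highwayhash code
}

impl DispatchHasher {
    pub fn new() -> Self {
        DispatchHasher { backend: detect_backend(), inner: DefaultHasher::new() }
    }

    pub fn backend(&self) -> Backend {
        self.backend
    }

    // Placeholder: a real version would cross the FFI boundary here.
    fn write_avx2(&mut self, bytes: &[u8]) {
        self.inner.write(bytes);
    }

    fn write_portable(&mut self, bytes: &[u8]) {
        self.inner.write(bytes);
    }
}

impl Hasher for DispatchHasher {
    fn write(&mut self, bytes: &[u8]) {
        // One branch on the flag per call: this is the dispatch overhead.
        match self.backend {
            Backend::Avx2 => self.write_avx2(bytes),
            Backend::Portable => self.write_portable(bytes),
        }
    }

    fn finish(&self) -> u64 {
        self.inner.finish()
    }
}
```

Storing the flag in the hasher keeps detection out of the hot path, at the cost of one predictable branch per `write` call.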

cc @alexcrichton what do you think?

cc @briansmith

@brson brson added A-libs I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Aug 16, 2016
@alexcrichton
Member

I've updated https://github.com/alexcrichton/siphash-bench to include the highwayhash repo, and the results look like this (on nightly):

test c1_siphash24::str_long     ... bench:       5,346 ns/iter (+/- 8)
test c1_siphash24::str_medium   ... bench:         283 ns/iter (+/- 19)
test c1_siphash24::str_small    ... bench:          16 ns/iter (+/- 1)
test c2_siphash24::str_long     ... bench:       3,700 ns/iter (+/- 3)
test c2_siphash24::str_medium   ... bench:         202 ns/iter (+/- 1)
test c2_siphash24::str_small    ... bench:          15 ns/iter (+/- 1)
test cpp_siphash::str_long      ... bench:       5,241 ns/iter (+/- 21)
test cpp_siphash::str_medium    ... bench:         267 ns/iter (+/- 3)
test cpp_siphash::str_small     ... bench:          17 ns/iter (+/- 0)
test rust_siphash13::str_long   ... bench:       2,282 ns/iter (+/- 13)
test rust_siphash13::str_medium ... bench:         123 ns/iter (+/- 1)
test rust_siphash13::str_small  ... bench:           8 ns/iter (+/- 1)
test rust_siphash24::str_long   ... bench:       4,223 ns/iter (+/- 44)
test rust_siphash24::str_medium ... bench:         218 ns/iter (+/- 0)
test rust_siphash24::str_small  ... bench:          11 ns/iter (+/- 0)

That is, SipHash13 in our standard library still beats out everything. There are a lot of hashes in that repo, though, and I could be building it wrong. @brson could you clarify which one you were referring to?
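For readers who want to reproduce numbers of this shape without nightly's `#[bench]`, here is a rough sketch of a ns/iter measurement on stable Rust. It is not the harness from the siphash-bench repo: `ns_per_iter` and `hash_bytes` are hypothetical helpers, the input sizes are assumptions, and `DefaultHasher` stands in for whichever implementation is being measured.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::hint::black_box;
use std::time::Instant;

/// Average wall-clock nanoseconds per call of `f` over `iters` calls.
pub fn ns_per_iter<F: FnMut() -> u64>(iters: u32, mut f: F) -> f64 {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the hash result.
        black_box(f());
    }
    start.elapsed().as_nanos() as f64 / f64::from(iters)
}

/// Hash `data` once; swap the hasher here to compare implementations.
pub fn hash_bytes(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(black_box(data));
    hasher.finish()
}

/// Print one line per input size (sizes are assumed, not from the thread).
pub fn report() {
    for &len in &[8usize, 256, 4096] {
        let input = vec![0xabu8; len];
        let ns = ns_per_iter(100_000, || hash_bytes(&input));
        println!("{:>5} bytes: {:.1} ns/iter", len, ns);
    }
}
```

A simple loop like this has no warm-up or variance estimate, so it is only good for spotting large differences.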

@brson
Contributor Author

brson commented Aug 17, 2016

Huh, siphash in the highwayhash repo is not AVX2 optimized. It's just compatible C. It's only SipTreeHash and HighwayTreeHash that are AVX2 optimized. Neither is the same algorithm as SipHash (though I think any crypto-hash is fair game for the default).

But furthermore, the AVX2 optimized hashes are even slower for small inputs than siphash, so we would really need a way to determine the two cases. Frustrating.

@brson
Contributor Author

brson commented Aug 17, 2016

Here's what I think I want in an ideal world:

  • The hasher buffers inputs up to some number of bytes that determines a 'big' input
  • If finish is called before the input reaches that size it uses a hypothetical crypto-safe hash that is fast for small inputs (I don't know if such a thing exists)
  • Otherwise it does cpu detection and if it has AVX2 drops to HighwayTreeHash / SipTreeHash
  • If no AVX2 then SipHash13
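The list above could be sketched roughly as follows. Everything here is an assumption for illustration: the type `BufferedHasher`, the `Path` enum, and the 64-byte threshold are hypothetical, and `DefaultHasher` stands in for all three algorithms (the fast small-input hash, the AVX2 tree hash, and SipHash13), so only the control flow is meaningful.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical cut-off for a "big" input; the thread names no value.
const BIG_INPUT_THRESHOLD: usize = 64;

/// Which path was taken (exposed only to make the sketch observable).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Path {
    Small,
    BigSimd,
    BigPortable,
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_avx2() -> bool {
    std::is_x86_feature_detected!("avx2")
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_avx2() -> bool {
    false
}

pub struct BufferedHasher {
    buf: Vec<u8>,
    // Set once the buffered input reaches the threshold; from then on
    // bytes are streamed straight into the "big input" hasher.
    big: Option<(Path, DefaultHasher)>,
}

impl BufferedHasher {
    pub fn new() -> Self {
        BufferedHasher { buf: Vec::with_capacity(BIG_INPUT_THRESHOLD), big: None }
    }

    pub fn path(&self) -> Path {
        match &self.big {
            Some((path, _)) => *path,
            None => Path::Small,
        }
    }
}

impl Hasher for BufferedHasher {
    fn write(&mut self, bytes: &[u8]) {
        if let Some((_, hasher)) = &mut self.big {
            hasher.write(bytes);
            return;
        }
        self.buf.extend_from_slice(bytes);
        if self.buf.len() >= BIG_INPUT_THRESHOLD {
            // CPU detection happens only once the input is known to be big.
            let path = if has_avx2() { Path::BigSimd } else { Path::BigPortable };
            let mut hasher = DefaultHasher::new(); // stand-in for either algorithm
            hasher.write(&self.buf);
            self.buf.clear();
            self.big = Some((path, hasher));
        }
    }

    fn finish(&self) -> u64 {
        match &self.big {
            Some((_, hasher)) => hasher.finish(),
            None => {
                // Stand-in for the hypothetical fast small-input hash.
                let mut hasher = DefaultHasher::new();
                hasher.write(&self.buf);
                hasher.finish()
            }
        }
    }
}
```

The buffering itself costs a copy of up to the threshold number of bytes per key, which would have to be weighed against whatever the small-input hash saves.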

@brson
Contributor Author

brson commented Aug 17, 2016

@alexcrichton In your benches c2_siphash24 is faster than rust_siphash24, or am I misreading?

@alexcrichton
Member

@brson yes, I believe the c2 implementation is faster than the Rust implementation; no idea why, though (I just threw up some numbers)

@sfackler
Member

SipHash was specifically designed to be fast for small inputs, and I think most of Google's effort on things like CityHash has been targeted at high throughput for long strings.

@bluss
Member

bluss commented Aug 20, 2016

Here's another person who implemented sse2/ssse3 siphash and, if I understand what they mean correctly, on x86-64 their basic (no explicit simd) version was the best. https://github.com/floodyberry/siphash

It seems there's a few things pointing towards siphash not being that amenable for optimization using sse/avx?

@briansmith
Contributor

> It seems there's a few things pointing towards siphash not being that amenable for optimization using sse/avx?

This is why Google made siptreehash: https://github.com/google/highwayhash#siptreehash.

Note that also Java abandoned the whole "strong hash function" approach due to performance concerns, in favor of a totally different way of structuring the hash table buckets in the face of many collisions: https://bugs.openjdk.java.net/browse/JDK-8046170
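To illustrate what "a different way of structuring the hash table buckets" can look like, here is a small sketch of a bucket that switches from a linear list to an ordered tree once it holds many colliding keys, in the spirit of the balanced-tree buckets in JDK 8's HashMap. It does not claim to mirror the linked change: `Bucket` and the threshold value are hypothetical, and `BTreeMap` is used simply because it is the ordered map Rust ships.

```rust
use std::collections::BTreeMap;

/// Illustrative cut-off for converting a bucket; not taken from the thread.
const TREEIFY_THRESHOLD: usize = 8;

/// One hash table bucket holding all entries whose hashes collided.
pub enum Bucket<K: Ord, V> {
    /// Few entries: a linear scan is cheap.
    List(Vec<(K, V)>),
    /// Many entries: lookups stay O(log n) even under a collision attack.
    Tree(BTreeMap<K, V>),
}

impl<K: Ord, V> Bucket<K, V> {
    pub fn new() -> Self {
        Bucket::List(Vec::new())
    }

    /// Insert or overwrite, returning the previous value for `key` if any.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        match self {
            Bucket::Tree(map) => map.insert(key, value),
            Bucket::List(items) => {
                if let Some(slot) = items.iter_mut().find(|slot| slot.0 == key) {
                    return Some(std::mem::replace(&mut slot.1, value));
                }
                items.push((key, value));
                if items.len() > TREEIFY_THRESHOLD {
                    // Too many collisions: move the entries into a tree.
                    let tree: BTreeMap<K, V> = std::mem::take(items).into_iter().collect();
                    *self = Bucket::Tree(tree);
                }
                None
            }
        }
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        match self {
            Bucket::List(items) => items.iter().find(|slot| slot.0 == *key).map(|slot| &slot.1),
            Bucket::Tree(map) => map.get(key),
        }
    }

    pub fn is_tree(&self) -> bool {
        matches!(self, Bucket::Tree(_))
    }
}
```

The trade-off is an extra `Ord` bound on keys in exchange for bounded worst-case lookup cost with a cheaper, non-keyed hash function.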

@brson
Contributor Author

brson commented Aug 23, 2016

Thanks for that link @briansmith.

In the spirit of the original post, I guess the only thing to do here is to replace the Rust siphash with the C siphash if it is indeed faster. @alexcrichton's benchmarks indicate that may be the case, by a decent percentage, assuming the C siphash13 compares to the Rust siphash13 the way the 2-4 variants compare.

@brson
Contributor Author

brson commented Aug 23, 2016

Although if we were to add new C code to std, that would invalidate some of the logic in the -C target-feature=cstatic RFC, which depends on std not containing C code (because MSVC has to compile it differently).

@bluss
Member

bluss commented Aug 23, 2016

One way to make Rust's siphasher faster is to enable one-shot hashing; there's an ongoing RFC for one approach to that: rust-lang/rfcs#1666

One-shot hashing would mean that the SipHasher type receives a whole byte slice that it can hash into a final hash value in one go. Currently SipHasher is implemented to take chunks at a time, and must have logic that keeps track of non-multiple-of-8 remainders between each call to Hasher::write.

The C implementation of siphash that was compared with does it all in one shot, so that's certainly a factor in its favour. I'm sure Rust would be equivalent if it did too.
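As a rough illustration of what "one shot" buys, here is a sketch of SipHash over a complete byte slice, written from the published algorithm rather than copied from std or from the C code benchmarked above. The function names and the const-generic round counts are choices made for this sketch. Because the whole input is available, the only remainder handling is a single pass over the final partial word, with no state carried between calls.

```rust
/// One SipRound over the four 64-bit state words.
fn sipround(v: &mut [u64; 4]) {
    v[0] = v[0].wrapping_add(v[1]);
    v[1] = v[1].rotate_left(13);
    v[1] ^= v[0];
    v[0] = v[0].rotate_left(32);
    v[2] = v[2].wrapping_add(v[3]);
    v[3] = v[3].rotate_left(16);
    v[3] ^= v[2];
    v[0] = v[0].wrapping_add(v[3]);
    v[3] = v[3].rotate_left(21);
    v[3] ^= v[0];
    v[2] = v[2].wrapping_add(v[1]);
    v[1] = v[1].rotate_left(17);
    v[1] ^= v[2];
    v[2] = v[2].rotate_left(32);
}

/// SipHash-C-D of `data` under the 128-bit key (k0, k1), in one call.
pub fn siphash<const C: usize, const D: usize>(k0: u64, k1: u64, data: &[u8]) -> u64 {
    let mut v = [
        k0 ^ 0x736f6d6570736575,
        k1 ^ 0x646f72616e646f6d,
        k0 ^ 0x6c7967656e657261,
        k1 ^ 0x7465646279746573,
    ];

    // Full 8-byte little-endian words: no tail bookkeeping needed here.
    let mut chunks = data.chunks_exact(8);
    for chunk in &mut chunks {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        let m = u64::from_le_bytes(word);
        v[3] ^= m;
        for _ in 0..C {
            sipround(&mut v);
        }
        v[0] ^= m;
    }

    // Final word: the 0..=7 leftover bytes plus the length in the top byte.
    let mut last = (data.len() as u64) << 56;
    for (i, &byte) in chunks.remainder().iter().enumerate() {
        last |= u64::from(byte) << (8 * i);
    }
    v[3] ^= last;
    for _ in 0..C {
        sipround(&mut v);
    }
    v[0] ^= last;

    // Finalization rounds.
    v[2] ^= 0xff;
    for _ in 0..D {
        sipround(&mut v);
    }
    v[0] ^ v[1] ^ v[2] ^ v[3]
}

/// Convenience wrapper for the 1-3 variant discussed in this thread.
pub fn siphash13(k0: u64, k1: u64, data: &[u8]) -> u64 {
    siphash::<1, 3>(k0, k1, data)
}
```

A streaming `Hasher` has to do the same work but must also stash up to seven leftover bytes after every `write` and merge them with the start of the next call, which is the bookkeeping the comment above refers to.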

bors added a commit that referenced this issue Oct 26, 2016
Small improvement to SipHasher

Very small but constant improvement, the objective is to lower latency for u16, u32 and small strings.

CC #35735

```
➜  siphash-bench git:(master) ✗ sudo nice -n -20 target/release/foo-648738a54f390643 --bench | tee benches.txt
[sudo] password for arthurprs:

running 62 tests
test _same                       ... bench:           0 ns/iter (+/- 0)
test _warmup                     ... bench:           0 ns/iter (+/- 0)
test rust_siphash13::int_u16     ... bench:          12 ns/iter (+/- 1)
test rust_siphash13::int_u32     ... bench:          14 ns/iter (+/- 0)
test rust_siphash13::int_u64     ... bench:          11 ns/iter (+/- 1)
test rust_siphash13::int_u8      ... bench:          11 ns/iter (+/- 1)
test rust_siphash13::slice::_10  ... bench:          18 ns/iter (+/- 1)
test rust_siphash13::slice::_100 ... bench:          42 ns/iter (+/- 2)
test rust_siphash13::slice::_11  ... bench:          19 ns/iter (+/- 1)
test rust_siphash13::slice::_12  ... bench:          21 ns/iter (+/- 3)
test rust_siphash13::slice::_2   ... bench:          16 ns/iter (+/- 2)
test rust_siphash13::slice::_200 ... bench:          68 ns/iter (+/- 3)
test rust_siphash13::slice::_3   ... bench:          17 ns/iter (+/- 3)
test rust_siphash13::slice::_4   ... bench:          18 ns/iter (+/- 1)
test rust_siphash13::slice::_5   ... bench:          19 ns/iter (+/- 4)
test rust_siphash13::slice::_6   ... bench:          19 ns/iter (+/- 1)
test rust_siphash13::slice::_7   ... bench:          20 ns/iter (+/- 1)
test rust_siphash13::slice::_8   ... bench:          16 ns/iter (+/- 1)
test rust_siphash13::slice::_9   ... bench:          18 ns/iter (+/- 2)
test rust_siphash13::str_::_10   ... bench:          18 ns/iter (+/- 1)
test rust_siphash13::str_::_100  ... bench:          41 ns/iter (+/- 2)
test rust_siphash13::str_::_11   ... bench:          19 ns/iter (+/- 1)
test rust_siphash13::str_::_12   ... bench:          20 ns/iter (+/- 2)
test rust_siphash13::str_::_2    ... bench:          16 ns/iter (+/- 1)
test rust_siphash13::str_::_200  ... bench:          68 ns/iter (+/- 3)
test rust_siphash13::str_::_3    ... bench:          17 ns/iter (+/- 1)
test rust_siphash13::str_::_4    ... bench:          18 ns/iter (+/- 2)
test rust_siphash13::str_::_5    ... bench:          19 ns/iter (+/- 6)
test rust_siphash13::str_::_6    ... bench:          20 ns/iter (+/- 5)
test rust_siphash13::str_::_7    ... bench:          23 ns/iter (+/- 1)
test rust_siphash13::str_::_8    ... bench:          15 ns/iter (+/- 1)
test rust_siphash13::str_::_9    ... bench:          17 ns/iter (+/- 1)
test sip1b::int_u16              ... bench:          10 ns/iter (+/- 1)
test sip1b::int_u32              ... bench:           9 ns/iter (+/- 1)
test sip1b::int_u64              ... bench:          12 ns/iter (+/- 1)
test sip1b::int_u8               ... bench:           7 ns/iter (+/- 0)
test sip1b::slice::_10           ... bench:          12 ns/iter (+/- 1)
test sip1b::slice::_100          ... bench:          33 ns/iter (+/- 2)
test sip1b::slice::_11           ... bench:          13 ns/iter (+/- 0)
test sip1b::slice::_12           ... bench:          12 ns/iter (+/- 1)
test sip1b::slice::_2            ... bench:          10 ns/iter (+/- 0)
test sip1b::slice::_200          ... bench:          62 ns/iter (+/- 2)
test sip1b::slice::_3            ... bench:          10 ns/iter (+/- 1)
test sip1b::slice::_4            ... bench:           9 ns/iter (+/- 0)
test sip1b::slice::_5            ... bench:          10 ns/iter (+/- 1)
test sip1b::slice::_6            ... bench:          10 ns/iter (+/- 0)
test sip1b::slice::_7            ... bench:          11 ns/iter (+/- 0)
test sip1b::slice::_8            ... bench:          11 ns/iter (+/- 1)
test sip1b::slice::_9            ... bench:          12 ns/iter (+/- 1)
test sip1b::str_::_10            ... bench:          15 ns/iter (+/- 1)
test sip1b::str_::_100           ... bench:          37 ns/iter (+/- 3)
test sip1b::str_::_11            ... bench:          16 ns/iter (+/- 1)
test sip1b::str_::_12            ... bench:          14 ns/iter (+/- 1)
test sip1b::str_::_2             ... bench:          13 ns/iter (+/- 1)
test sip1b::str_::_200           ... bench:          67 ns/iter (+/- 5)
test sip1b::str_::_3             ... bench:          14 ns/iter (+/- 2)
test sip1b::str_::_4             ... bench:          12 ns/iter (+/- 1)
test sip1b::str_::_5             ... bench:          13 ns/iter (+/- 1)
test sip1b::str_::_6             ... bench:          13 ns/iter (+/- 0)
test sip1b::str_::_7             ... bench:          16 ns/iter (+/- 1)
test sip1b::str_::_8             ... bench:          14 ns/iter (+/- 1)
test sip1b::str_::_9             ... bench:          15 ns/iter (+/- 1)

test result: ok. 0 passed; 0 failed; 0 ignored; 62 measured

➜  siphash-bench git:(master) ✗ cargo benchcmp rust_siphash13:: sip1b:: benches.txt
 name         rust_siphash13:: ns/iter  sip1b:: ns/iter  diff ns/iter   diff %
 int_u16      12                        10                         -2  -16.67%
 int_u32      14                        9                          -5  -35.71%
 int_u64      11                        12                          1    9.09%
 int_u8       11                        7                          -4  -36.36%
 slice::_10   18                        12                         -6  -33.33%
 slice::_100  42                        33                         -9  -21.43%
 slice::_11   19                        13                         -6  -31.58%
 slice::_12   21                        12                         -9  -42.86%
 slice::_2    16                        10                         -6  -37.50%
 slice::_200  68                        62                         -6   -8.82%
 slice::_3    17                        10                         -7  -41.18%
 slice::_4    18                        9                          -9  -50.00%
 slice::_5    19                        10                         -9  -47.37%
 slice::_6    19                        10                         -9  -47.37%
 slice::_7    20                        11                         -9  -45.00%
 slice::_8    16                        11                         -5  -31.25%
 slice::_9    18                        12                         -6  -33.33%
 str_::_10    18                        15                         -3  -16.67%
 str_::_100   41                        37                         -4   -9.76%
 str_::_11    19                        16                         -3  -15.79%
 str_::_12    20                        14                         -6  -30.00%
 str_::_2     16                        13                         -3  -18.75%
 str_::_200   68                        67                         -1   -1.47%
 str_::_3     17                        14                         -3  -17.65%
 str_::_4     18                        12                         -6  -33.33%
 str_::_5     19                        13                         -6  -31.58%
 str_::_6     20                        13                         -7  -35.00%
 str_::_7     23                        16                         -7  -30.43%
 str_::_8     15                        14                         -1   -6.67%
 str_::_9     17                        15                         -2  -11.76%

```
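The `diff %` column in the comparison above is consistent with the relative change from the first column to the second, so negative values mean the second implementation is faster. A tiny sketch of that arithmetic (the helper name `diff_percent` is hypothetical and not part of cargo benchcmp):

```rust
/// Relative change from `old_ns` to `new_ns`, in percent.
/// Negative results mean the new measurement is faster.
pub fn diff_percent(old_ns: f64, new_ns: f64) -> f64 {
    (new_ns - old_ns) / old_ns * 100.0
}

/// Format a value the way the table shows it, with two decimals.
pub fn format_diff(old_ns: f64, new_ns: f64) -> String {
    format!("{:.2}%", diff_percent(old_ns, new_ns))
}
```

For the `int_u16` row, going from 12 ns to 10 ns gives (10 - 12) / 12, which rounds to the -16.67% shown in the table.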

From a modified hash-rs suite (preallocating maps and adding slice/str variants):

graph version: http://imgur.com/a/DuoI4

```
➜  hash-rs git:(rfc-extend-hasher) ✗ cargo benchcmp sip13:: sip13opt:: benches.txt
 name                             sip13:: ns/iter      sip13opt:: ns/iter   diff ns/iter   diff %
 slice::mapcountdense_000000001   27,343 (36 MB/s)     26,401 (37 MB/s)             -942   -3.45%
 slice::mapcountdense_000000002   28,982 (69 MB/s)     26,807 (74 MB/s)           -2,175   -7.50%
 slice::mapcountdense_000000003   29,304 (102 MB/s)    27,360 (109 MB/s)          -1,944   -6.63%
 slice::mapcountdense_000000004   30,411 (131 MB/s)    25,888 (154 MB/s)          -4,523  -14.87%
 slice::mapcountdense_000000005   32,625 (153 MB/s)    27,486 (181 MB/s)          -5,139  -15.75%
 slice::mapcountdense_000000006   34,920 (171 MB/s)    27,204 (220 MB/s)          -7,716  -22.10%
 slice::mapcountdense_000000007   33,497 (208 MB/s)    28,330 (247 MB/s)          -5,167  -15.43%
 slice::mapcountdense_000000008   31,153 (256 MB/s)    28,617 (279 MB/s)          -2,536   -8.14%
 slice::mapcountdense_000000009   30,745 (292 MB/s)    29,666 (303 MB/s)          -1,079   -3.51%
 slice::mapcountdense_000000010   31,509 (317 MB/s)    29,804 (335 MB/s)          -1,705   -5.41%
 slice::mapcountdense_000000011   32,526 (338 MB/s)    30,520 (360 MB/s)          -2,006   -6.17%
 slice::mapcountdense_000000012   32,981 (363 MB/s)    28,739 (417 MB/s)          -4,242  -12.86%
 slice::mapcountdense_000000013   34,713 (374 MB/s)    30,348 (428 MB/s)          -4,365  -12.57%
 slice::mapcountdense_000000014   34,635 (404 MB/s)    29,974 (467 MB/s)          -4,661  -13.46%
 slice::mapcountdense_000000015   35,924 (417 MB/s)    30,584 (490 MB/s)          -5,340  -14.86%
 slice::mapcountdense_000000016   31,939 (500 MB/s)    30,564 (523 MB/s)          -1,375   -4.31%
 slice::mapcountdense_000000032   36,545 (875 MB/s)    34,833 (918 MB/s)          -1,712   -4.68%
 slice::mapcountdense_000000064   44,691 (1432 MB/s)   43,912 (1457 MB/s)           -779   -1.74%
 slice::mapcountdense_000000128   67,210 (1904 MB/s)   64,630 (1980 MB/s)         -2,580   -3.84%
 slice::mapcountdense_000000256   110,320 (2320 MB/s)  108,713 (2354 MB/s)        -1,607   -1.46%
 slice::mapcountsparse_000000001  29,686 (33 MB/s)     28,673 (34 MB/s)           -1,013   -3.41%
 slice::mapcountsparse_000000002  32,073 (62 MB/s)     30,519 (65 MB/s)           -1,554   -4.85%
 slice::mapcountsparse_000000003  33,184 (90 MB/s)     31,208 (96 MB/s)           -1,976   -5.95%
 slice::mapcountsparse_000000004  34,344 (116 MB/s)    30,242 (132 MB/s)          -4,102  -11.94%
 slice::mapcountsparse_000000005  34,536 (144 MB/s)    30,552 (163 MB/s)          -3,984  -11.54%
 slice::mapcountsparse_000000006  35,791 (167 MB/s)    30,813 (194 MB/s)          -4,978  -13.91%
 slice::mapcountsparse_000000007  36,773 (190 MB/s)    31,362 (223 MB/s)          -5,411  -14.71%
 slice::mapcountsparse_000000008  33,101 (241 MB/s)    32,399 (246 MB/s)            -702   -2.12%
 slice::mapcountsparse_000000009  34,025 (264 MB/s)    33,065 (272 MB/s)            -960   -2.82%
 slice::mapcountsparse_000000010  34,755 (287 MB/s)    33,152 (301 MB/s)          -1,603   -4.61%
 slice::mapcountsparse_000000011  35,682 (308 MB/s)    33,631 (327 MB/s)          -2,051   -5.75%
 slice::mapcountsparse_000000012  36,422 (329 MB/s)    32,604 (368 MB/s)          -3,818  -10.48%
 slice::mapcountsparse_000000013  37,561 (346 MB/s)    32,978 (394 MB/s)          -4,583  -12.20%
 slice::mapcountsparse_000000014  38,476 (363 MB/s)    33,376 (419 MB/s)          -5,100  -13.26%
 slice::mapcountsparse_000000015  39,202 (382 MB/s)    33,750 (444 MB/s)          -5,452  -13.91%
 slice::mapcountsparse_000000016  34,898 (458 MB/s)    33,621 (475 MB/s)          -1,277   -3.66%
 slice::mapcountsparse_000000032  39,767 (804 MB/s)    38,013 (841 MB/s)          -1,754   -4.41%
 slice::mapcountsparse_000000064  47,810 (1338 MB/s)   46,332 (1381 MB/s)         -1,478   -3.09%
 slice::mapcountsparse_000000128  64,519 (1983 MB/s)   63,322 (2021 MB/s)         -1,197   -1.86%
 slice::mapcountsparse_000000256  101,042 (2533 MB/s)  99,754 (2566 MB/s)         -1,288   -1.27%
 str_::mapcountdense_000000001    27,183 (36 MB/s)     24,007 (41 MB/s)           -3,176  -11.68%
 str_::mapcountdense_000000002    28,940 (69 MB/s)     24,574 (81 MB/s)           -4,366  -15.09%
 str_::mapcountdense_000000003    29,000 (103 MB/s)    24,687 (121 MB/s)          -4,313  -14.87%
 str_::mapcountdense_000000004    29,822 (134 MB/s)    24,377 (164 MB/s)          -5,445  -18.26%
 str_::mapcountdense_000000005    31,962 (156 MB/s)    25,184 (198 MB/s)          -6,778  -21.21%
 str_::mapcountdense_000000006    32,218 (186 MB/s)    25,020 (239 MB/s)          -7,198  -22.34%
 str_::mapcountdense_000000007    35,482 (197 MB/s)    27,705 (252 MB/s)          -7,777  -21.92%
 str_::mapcountdense_000000008    28,643 (279 MB/s)    25,563 (312 MB/s)          -3,080  -10.75%
 str_::mapcountdense_000000009    30,112 (298 MB/s)    26,773 (336 MB/s)          -3,339  -11.09%
 str_::mapcountdense_000000010    31,554 (316 MB/s)    27,607 (362 MB/s)          -3,947  -12.51%
 str_::mapcountdense_000000011    32,062 (343 MB/s)    27,770 (396 MB/s)          -4,292  -13.39%
 str_::mapcountdense_000000012    32,258 (372 MB/s)    25,612 (468 MB/s)          -6,646  -20.60%
 str_::mapcountdense_000000013    33,544 (387 MB/s)    26,908 (483 MB/s)          -6,636  -19.78%
 str_::mapcountdense_000000014    34,681 (403 MB/s)    27,267 (513 MB/s)          -7,414  -21.38%
 str_::mapcountdense_000000015    37,883 (395 MB/s)    30,226 (496 MB/s)          -7,657  -20.21%
 str_::mapcountdense_000000016    30,299 (528 MB/s)    27,960 (572 MB/s)          -2,339   -7.72%
 str_::mapcountdense_000000032    34,372 (930 MB/s)    32,736 (977 MB/s)          -1,636   -4.76%
 str_::mapcountdense_000000048    38,610 (1243 MB/s)   36,437 (1317 MB/s)         -2,173   -5.63%
 str_::mapcountdense_000000064    43,052 (1486 MB/s)   41,269 (1550 MB/s)         -1,783   -4.14%
 str_::mapcountdense_000000128    64,059 (1998 MB/s)   62,007 (2064 MB/s)         -2,052   -3.20%
 str_::mapcountdense_000000256    109,608 (2335 MB/s)  107,184 (2388 MB/s)        -2,424   -2.21%
 str_::mapcountsparse_000000001   29,155 (34 MB/s)     26,151 (38 MB/s)           -3,004  -10.30%
 str_::mapcountsparse_000000002   31,536 (63 MB/s)     27,787 (71 MB/s)           -3,749  -11.89%
 str_::mapcountsparse_000000003   32,524 (92 MB/s)     27,861 (107 MB/s)          -4,663  -14.34%
 str_::mapcountsparse_000000004   33,535 (119 MB/s)    27,585 (145 MB/s)          -5,950  -17.74%
 str_::mapcountsparse_000000005   34,239 (146 MB/s)    27,520 (181 MB/s)          -6,719  -19.62%
 str_::mapcountsparse_000000006   35,485 (169 MB/s)    27,437 (218 MB/s)          -8,048  -22.68%
 str_::mapcountsparse_000000007   39,098 (179 MB/s)    30,465 (229 MB/s)          -8,633  -22.08%
 str_::mapcountsparse_000000008   30,882 (259 MB/s)    29,215 (273 MB/s)          -1,667   -5.40%
 str_::mapcountsparse_000000009   33,375 (269 MB/s)    29,301 (307 MB/s)          -4,074  -12.21%
 str_::mapcountsparse_000000010   33,531 (298 MB/s)    29,008 (344 MB/s)          -4,523  -13.49%
 str_::mapcountsparse_000000011   34,607 (317 MB/s)    29,800 (369 MB/s)          -4,807  -13.89%
 str_::mapcountsparse_000000012   35,700 (336 MB/s)    28,380 (422 MB/s)          -7,320  -20.50%
 str_::mapcountsparse_000000013   36,692 (354 MB/s)    29,350 (442 MB/s)          -7,342  -20.01%
 str_::mapcountsparse_000000014   37,326 (375 MB/s)    29,285 (478 MB/s)          -8,041  -21.54%
 str_::mapcountsparse_000000015   41,098 (364 MB/s)    33,073 (453 MB/s)          -8,025  -19.53%
 str_::mapcountsparse_000000016   33,046 (484 MB/s)    30,717 (520 MB/s)          -2,329   -7.05%
 str_::mapcountsparse_000000032   37,471 (853 MB/s)    35,542 (900 MB/s)          -1,929   -5.15%
 str_::mapcountsparse_000000048   41,324 (1161 MB/s)   39,332 (1220 MB/s)         -1,992   -4.82%
 str_::mapcountsparse_000000064   45,858 (1395 MB/s)   43,802 (1461 MB/s)         -2,056   -4.48%
 str_::mapcountsparse_000000128   62,471 (2048 MB/s)   60,683 (2109 MB/s)         -1,788   -2.86%
 str_::mapcountsparse_000000256   101,283 (2527 MB/s)  97,655 (2621 MB/s)         -3,628   -3.58%
```
@Mark-Simulacrum Mark-Simulacrum added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jul 25, 2017
@waterlens
Contributor

Since the SipHasher has been deprecated in the core library, I think the improvement is not important. So the issue could be closed.

@Amanieu
Member

Amanieu commented Jul 16, 2021

It doesn't seem like the customized version of SipHash brings any performance improvement over what we already have, so I'm just going to close this.

@Amanieu Amanieu closed this as completed Jul 16, 2021

9 participants