Differing impls for bits and bytes #44

sharksforarms · 2020-06-04T12:43:02Z

If bytes attribute is used and the index is on a byte boundary, it may be quicker to read from a slice of &[u8] instead of reading 8*n bits.

I'd like for more benchmarks to be written before so this optimization can be measured

One option could be to feature flag the bits/bytes attributes

The text was updated successfully, but these errors were encountered:

constfold · 2020-07-28T06:51:37Z

Since #61 has been merged, this could be done by:

Add a new type ByteSize
Add specialization like impl DekuRead<ByteSize> for xxx
bytes uses type ByteSize, bits uses type BitSize, then compiler will decide which impl should be called.

sharksforarms · 2020-07-30T14:44:13Z

Great idea, I'd like to see a better suite of benchmarks before this is tackled so we can measure the difference

#25

Remove Size in favor of a BitSize and ByteSize. This allows the Deku{Read/Write}(Endian, BitSize) and Deku{Read/Write}(Endian, ByteSize) impls to be created and allow the ByteSize impls to be faster. Most of the assumption I made (and the perf that is gained) is from the removal of this branch for bytes: let value = if pad == 0 && bit_slice.len() == max_type_bits && bit_slice.as_raw_slice().len() * 8 == max_type_bits { See sharksforarms#44 I Added some benchmarks, in order to get a good idea of what perf this allows. The benchmarks look pretty good, but I'm on my old thinkpad. These results are vs the benchmarks running from master. deku_read_vec time: [744.39 ns 751.13 ns 759.58 ns] change: [-23.681% -21.358% -19.142%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe See sharksforarms#25

Remove Size in favor of a BitSize and ByteSize. This allows the Deku{Read/Write}(Endian, BitSize) and Deku{Read/Write}(Endian, ByteSize) impls to be created and allow the ByteSize impls to be faster. Most of the assumption I made (and the perf that is gained) is from the removal of this branch for bytes: let value = if pad == 0 && bit_slice.len() == max_type_bits && bit_slice.as_raw_slice().len() * 8 == max_type_bits { See #44 I Added some benchmarks, in order to get a good idea of what perf this allows. The benchmarks look pretty good, but I'm on my old thinkpad. These results are vs the benchmarks running from master. deku_read_vec time: [744.39 ns 751.13 ns 759.58 ns] change: [-23.681% -21.358% -19.142%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe See #25

wcampbell0x2a · 2024-05-24T03:54:39Z

Closing, the reader does this

sharksforarms changed the title ~~Optimization: Differing impls for bits and bytes~~ Differing impls for bits and bytes Jun 4, 2020

sharksforarms added the optimization Optimization label Jun 4, 2020

sharksforarms mentioned this issue Nov 26, 2020

fix: Convert BitSize to a enum with two variants, Bits and Bytes #138

Merged

wcampbell0x2a mentioned this issue Sep 28, 2022

Specialize bytes #278

Merged

wcampbell0x2a mentioned this issue Aug 3, 2023

Impl from_reader #352

Merged

8 tasks

wcampbell0x2a closed this as completed May 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Differing impls for bits and bytes #44

Differing impls for bits and bytes #44

sharksforarms commented Jun 4, 2020

constfold commented Jul 28, 2020

sharksforarms commented Jul 30, 2020

wcampbell0x2a commented May 24, 2024

Differing impls for bits and bytes #44

Differing impls for bits and bytes #44

Comments

sharksforarms commented Jun 4, 2020

constfold commented Jul 28, 2020

sharksforarms commented Jul 30, 2020

wcampbell0x2a commented May 24, 2024