The Grand Metadata Reform #22971
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @huonw (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see CONTRIBUTING.md for more information.
Wow, this looks amazing! Fantastic work @lifthrasiir!
I've found a good "benchmark" to just be timing a compile of "hello world" which involves reading a fair amount of metadata. Not super scientific but it may put you in the ballpark of whether it's 10x faster, 10x slower, or basically the same.
This is awesome!
//!
//! - `F64` (`0a`): 8-byte big endian unsigned integer representing
//!   IEEE 754 binary64 floating-point format.
//! - `F32` (`0b`): 4-byte big endian unsigned integer representing
I wonder if we could detect whether an f64 can be represented as an f32 without losing precision, and hence save 4 bytes (I guess this doesn't turn up much?).
They actually don't exist in the metadata as far as I know, so it seemed fine as it is.
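For reference, the check asked about above can be sketched as a round trip through `f32`. The function name `fits_in_f32` is hypothetical and not part of the PR; this only illustrates the idea and is not something the encoder does.

```rust
/// Returns true when `x` survives a round trip through `f32` unchanged.
/// NaN compares unequal to itself, so it is reported as not fitting.
fn fits_in_f32(x: f64) -> bool {
    (x as f32) as f64 == x
}
```

Values such as `0.5` pass the check, while `0.1` (not exactly representable in either format) and magnitudes beyond the `f32` range do not.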
eef9c9a to 3d1facd
Rebased onto 2ca6eae. Slightly fixed several commits (mostly error handling and type safety things) and added one commit for tag reordering. I'm currently rebuilding new commits.
Together with `start_tag` and `end_tag`, they are commonly used to write a document containing binary data of known size. However, using `start_tag` always makes the length 4 bytes long, which is almost never optimal (and requires the relaxation step to remedy). Using the `wr_tagged_*` methods directly is better for both readability and the resulting metadata size.
EBML tags are encoded in a variable-length unsigned int (vuint), which is clever but causes some tags to be encoded in two bytes while there are really only about 180 tags or so. Assuming that there won't be, say, over 1,000 tags in the future, we can use a much more efficient encoding scheme. The new scheme should support at most 4,096 tags anyway. This also flattens a scattered tag namespace (did you know that 0xa9 is followed by 0xb0?) and makes room for autoserialized tags in 0x00 through 0x1f.
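A tag encoding along these lines could be sketched as follows. The split between one-byte tags (below `0xf0`) and two-byte tags (`0x100` through `0xfff`, with the high nibble of the first byte set to `0xf`), as well as the names `encode_tag` and `decode_tag`, are assumptions for illustration; the commit message itself only states the 4,096-tag ceiling.

```rust
/// Hypothetical tag encoder. Tags below 0xf0 take one literal byte;
/// tags in 0x100..=0xfff take two bytes, the first being 0xf0 | high nibble.
/// Anything else is treated as unencodable in this sketch.
fn encode_tag(tag: u16, out: &mut Vec<u8>) -> Result<(), String> {
    match tag {
        0x000..=0x0ef => {
            out.push(tag as u8);
            Ok(())
        }
        0x100..=0xfff => {
            out.push(0xf0 | (tag >> 8) as u8);
            out.push((tag & 0xff) as u8);
            Ok(())
        }
        _ => Err(format!("tag {:#x} is not encodable", tag)),
    }
}

/// Hypothetical tag decoder. Returns the tag and the number of bytes consumed.
fn decode_tag(input: &[u8]) -> Option<(u16, usize)> {
    let first = *input.first()?;
    if first < 0xf0 {
        return Some((first as u16, 1));
    }
    let second = *input.get(1)?;
    let tag = (((first & 0x0f) as u16) << 8) | second as u16;
    // Reject two-byte forms of tags that should have used one byte.
    if tag < 0x100 {
        None
    } else {
        Some((tag, 2))
    }
}
```

Under these assumptions every tag below `0xf0` costs a single byte, which covers the roughly 180 tags in use.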
Many auto-serialization tags are fixed-size (note: many ordinary tags are also fixed-size but for now this commit ignores them), so having an explicit length is a waste. This moves any auto-serialization tags with an implicit length before other tags, so a test for them is easy. A preliminary experiment shows this has at least 1% gain over the status quo.
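As a rough illustration, a reader can look up the payload size from the tag alone. The four tag values below are the ones visible in the diff quoted later in this thread; the function names and the decision to reject every other tag are assumptions made to keep the sketch small.

```rust
/// Payload size for tags whose length is implied by the tag itself.
/// Only the four values quoted in the review thread are listed here.
fn implicit_len(tag: u8) -> Option<usize> {
    match tag {
        0x0a => Some(4), // EsF32
        0x0b => Some(8), // EsF64
        0x0c => Some(1), // EsSub8
        0x0d => Some(4), // EsSub32
        _ => None,
    }
}

/// Splits one implicit-length element off the front of `input`,
/// returning (tag, payload, remaining bytes).
fn read_element(input: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    let (&tag, rest) = input.split_first()?;
    // Explicit-length tags are outside the scope of this sketch.
    let len = implicit_len(tag)?;
    if rest.len() < len {
        return None;
    }
    let (payload, rest) = rest.split_at(len);
    Some((tag, payload, rest))
}
```

Because no length field is stored, an `EsSub8` element occupies two bytes in total.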
It doesn't serve any useful purpose. It *might* be useful when there are some tags that are generated by `Encodable` and not delimited by any tags, but IIUC it's not the case.

Previous:

                  <-------------------- len1 ------------------->
    EsEnum <len1> EsEnumVid <vid> EsEnumBody <len2> <arg1> <arg2>
                                                    <--- len2 -->

Now:

                  <----------- len1 ---------->
    EsEnum <len1> EsEnumVid <vid> <arg1> <arg2>
For reference, while it is designed to be selectively enabled, it was essentially enabled throughout every snapshot and nightly as far as I can tell. This makes the usefulness of `EsLabel` itself questionable, as it was quite rare that `EsLabel` broke the build. It had consumed about 20~30% of metadata (!) and so this should be a huge win.
They replace the existing `EsEnumVid`, `EsVecLen` and `EsMapLen` tags altogether; the meaning of them can be easily inferred from the enclosing tag. It also has an added benefit of encodings for smaller variant ids or lengths being more compact (5 bytes to 2 bytes).
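The size saving can be sketched like this. The tag values `0x0c` and `0x0d` are taken from the diff quoted later in this thread; the function name `write_sub` and the big-endian byte order for the 4-byte form are assumptions.

```rust
const ES_SUB8: u8 = 0x0c; // followed by 1 byte
const ES_SUB32: u8 = 0x0d; // followed by 4 bytes

/// Writes a variant id or length using the smallest of the two forms.
fn write_sub(n: u32, out: &mut Vec<u8>) {
    if n <= u8::MAX as u32 {
        out.push(ES_SUB8);
        out.push(n as u8);
    } else {
        out.push(ES_SUB32);
        // Byte order is an assumption for this sketch.
        out.extend_from_slice(&n.to_be_bytes());
    }
}
```

A small value such as a variant id of 3 then takes 2 bytes (tag plus one byte) instead of 5.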
We try to move the data when the length can be encoded in a much smaller number of bytes. This interferes with indices and type abbreviations however, so this commit introduces a public interface to get and mark a "stable" (i.e. not affected by relaxation) position of the current pointer. The relaxation logic only moves small data, currently at most 256 bytes, as moving the data can be costly. There might be further opportunities to allow more relaxation by moving fields around, which I didn't seriously try.
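A minimal sketch of the relaxation idea, under several assumptions: the `Writer` type, its methods, the tag value used in the usage note, and the vuint thresholds are all hypothetical, and only the 4-byte placeholder and the 256-byte limit come from the commit messages.

```rust
/// Largest payload that is moved during relaxation (from the commit message).
const RELAX_MAX: usize = 256;

/// Bytes needed for an EBML-style vuint; thresholds are assumed here.
fn vuint_len(n: usize) -> usize {
    if n < 0x80 { 1 } else if n < 0x4000 { 2 } else if n < 0x20_0000 { 3 } else { 4 }
}

/// Writes `n` as a vuint of exactly `width` bytes (1..=4) into `buf`.
fn write_vuint(buf: &mut [u8], n: usize, width: usize) {
    assert!(n < 0x1000_0000, "length does not fit in a 4-byte vuint");
    // Marker bit: 0x80 for 1 byte, 0x40 for 2, 0x20 for 3, 0x10 for 4.
    let marker = (0x80u32 >> (width - 1)) << (8 * (width - 1));
    let bytes = ((n as u32) | marker).to_be_bytes();
    buf.copy_from_slice(&bytes[4 - width..]);
}

struct Writer {
    buf: Vec<u8>,
    open: Vec<usize>, // positions of pending length placeholders
}

impl Writer {
    fn new() -> Writer {
        Writer { buf: Vec::new(), open: Vec::new() }
    }

    fn start_tag(&mut self, tag: u8) {
        self.buf.push(tag);
        self.open.push(self.buf.len());
        self.buf.extend_from_slice(&[0; 4]); // 4-byte length placeholder
    }

    fn write(&mut self, data: &[u8]) {
        self.buf.extend_from_slice(data);
    }

    fn end_tag(&mut self) {
        let len_pos = self.open.pop().expect("end_tag without start_tag");
        let data_start = len_pos + 4;
        let size = self.buf.len() - data_start;
        // Only small payloads are moved; larger ones keep the 4-byte length.
        let width = if size <= RELAX_MAX { vuint_len(size) } else { 4 };
        if width < 4 {
            // Shift the payload left over the unused placeholder bytes.
            self.buf.copy_within(data_start.., len_pos + width);
            self.buf.truncate(len_pos + width + size);
        }
        write_vuint(&mut self.buf[len_pos..len_pos + width], size, width);
    }
}
```

With a made-up tag `0x20` and a two-byte payload, the output shrinks from `20 10 00 00 02 ..` to `20 82 ..`. Any byte offset recorded inside the open tag before `end_tag` runs is invalidated by the shift, which is why positions used by indices have to be marked as stable.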
So that `EsVec 82 EsSub8 00` becomes `EsVec 80` now.
This avoids a biggish eight-byte `tag_table_id` tag in favor of autoserialized integer tags, which are smaller and can be later used to encode them in the optimal number of bytes. `NodeId` was u32 after all.

Previously:

                       <------------- len1 -------------->
    tag_table_* <len1> tag_table_id 88 <nodeid in 8 bytes>
                       tag_table_val <len2> <actual data>
                                            <-- len2 --->

Now:

                      <--------------- len --------------->
    tag_table_* <len> U32 <nodeid in 4 bytes> <actual data>
Previously every auto-serialized tag was strongly typed. However this is not strictly required, and instead it can be exploited to provide the optimal encoding for smaller integers. This commit repurposes `EsI8`/`EsU8` through `EsI64`/`EsU64` tags to represent *any* integers with given ranges: it is now possible to encode `42u64` as two bytes `EsU8 0x2a`, for example. There are some limitations:

* It does not apply to non-auto-serialized tags for obvious reasons. Fortunately, we have already eliminated the biggest source of such tags in favor of auto-serialized tags: `tag_table_id`.
* Bigger tags cannot be used to represent smaller types.
* Signed tags and unsigned tags do not mix.
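The unsigned half of this scheme can be sketched as below. The `UTag` enum, the function names, and the big-endian payload order are assumptions for illustration; the tag names themselves come from the commit message.

```rust
/// Unsigned integer tags, named after those in the commit message.
#[derive(Debug, PartialEq, Clone, Copy)]
enum UTag {
    EsU8,
    EsU16,
    EsU32,
    EsU64,
}

/// Picks the smallest unsigned tag that can hold `v`.
fn encode_uint(v: u64) -> (UTag, Vec<u8>) {
    if v <= u8::MAX as u64 {
        (UTag::EsU8, vec![v as u8])
    } else if v <= u16::MAX as u64 {
        (UTag::EsU16, (v as u16).to_be_bytes().to_vec())
    } else if v <= u32::MAX as u64 {
        (UTag::EsU32, (v as u32).to_be_bytes().to_vec())
    } else {
        (UTag::EsU64, v.to_be_bytes().to_vec())
    }
}

/// A `u64` reader accepts any unsigned tag.
fn decode_u64(tag: UTag, payload: &[u8]) -> Option<u64> {
    let width = match tag {
        UTag::EsU8 => 1,
        UTag::EsU16 => 2,
        UTag::EsU32 => 4,
        UTag::EsU64 => 8,
    };
    if payload.len() != width {
        return None;
    }
    Some(payload.iter().fold(0u64, |acc, &b| (acc << 8) | b as u64))
}

/// A `u8` reader accepts only `EsU8`: bigger tags never stand in
/// for smaller types.
fn decode_u8(tag: UTag, payload: &[u8]) -> Option<u8> {
    match (tag, payload) {
        (UTag::EsU8, [b]) => Some(*b),
        _ => None,
    }
}
```

Here `encode_uint(42)` yields `EsU8` with the single payload byte `0x2a`, matching the example in the commit message, and `decode_u64` reads it back as 42.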
We have changed the encoding enough to bump that. Also added some notes about metadata encoding to librbml/lib.rs.
3d1facd to 3647158
Re-rebased onto b4c965e. I had made a stupid mistake :S Edit: This passes
EsF32 = 0x0a, // + 4 bytes
EsF64 = 0x0b, // + 8 bytes
EsSub8 = 0x0c, // + 1 byte
EsSub32 = 0x0d, // + 4 bytes
Maybe this should be `0x0e`, so that `EsSub16` can go between `8` and `32`?
I originally tried to do so. But this would mean that we need to encode implicit lengths for tag 0x0d without knowing what it would actually be. I guess we can trade a slight inconsistency for backward compatibility (of sorts). And if we don't value compatibility, we can simply move `EsSub32`.
There were never any compatibility guarantees about metadata (any kind of versioning scheme should be encoded outside of RBML).
Also clarified the mysterious `_next_int` method.
3647158 to 2008b54
@bors r+
@bors: p=1
@alexcrichton On the performance: I've built a simple Hello, world program with both nightly and stage2. Performance figures are roughly the same, but curiously (or obviously?) the new rustc is both faster (best time being 0.56s vs. 0.62s) and consumes less memory (best max resident being 45.3M vs. 57.9M). I guess it is an impact of decoder changes plus metadata reduction, but I can't be sure. I've also run
⌛ Testing commit 2008b54 with merge 2e0e16e...
⌛ Testing commit 2008b54 with merge 24a840d...
This is a series of individual but correlated changes to the metadata format. The changes are significant enough that it (finally) bumps the metadata encoding version. In brief, they altogether reduce the total size of stage1 binaries by 27% (!!!!). Almost every low-hanging fruit has been considered and fixed; see the individual commits for details.

Detailed library (not just metadata) size changes for x86_64-unknown-linux-gnu stage1 binaries (baseline being 3a96d6a):

````
   before     after  delta path
--------- --------- ------ --------------------------------
  1706146   1050412  38.4% liballoc-4e7c5e5c.rlib
   398576    152454  61.8% libarena-4e7c5e5c.rlib
    71441     56892  20.4% libarena-4e7c5e5c.so
 14424754   5084102  64.8% libcollections-4e7c5e5c.rlib
 39143186  14743118  62.3% libcore-4e7c5e5c.rlib
   195574    188150   3.8% libflate-4e7c5e5c.rlib
   153123    152603   0.3% libflate-4e7c5e5c.so
   477152    215262  54.9% libfmt_macros-4e7c5e5c.rlib
    77728     66601  14.3% libfmt_macros-4e7c5e5c.so
  1216936    684104  43.8% libgetopts-4e7c5e5c.rlib
   207846    181116  12.9% libgetopts-4e7c5e5c.so
   349722    147530  57.8% libgraphviz-4e7c5e5c.rlib
    60196     49197  18.3% libgraphviz-4e7c5e5c.so
   729842    259906  64.4% liblibc-4e7c5e5c.rlib
   349358    247014  29.3% liblog-4e7c5e5c.rlib
    88878     83163   6.4% liblog-4e7c5e5c.so
  1968508    732840  62.8% librand-4e7c5e5c.rlib
  1968204    696326  64.6% librbml-4e7c5e5c.rlib
   283207    206589  27.1% librbml-4e7c5e5c.so
 72369394  46401230  35.9% librustc-4e7c5e5c.rlib
 11941372  10498483  12.1% librustc-4e7c5e5c.so
  2717894   1983272  27.0% librustc_back-4e7c5e5c.rlib
   501900    464176   7.5% librustc_back-4e7c5e5c.so
    15058     12588  16.4% librustc_bitflags-4e7c5e5c.rlib
  4008268   2961912  26.1% librustc_borrowck-4e7c5e5c.rlib
   837550    785633   6.2% librustc_borrowck-4e7c5e5c.so
  6473348   6095470   5.8% librustc_driver-4e7c5e5c.rlib
  1448785   1433945   1.0% librustc_driver-4e7c5e5c.so
 95483688  94779704   0.7% librustc_llvm-4e7c5e5c.rlib
 43516815  43487809   0.1% librustc_llvm-4e7c5e5c.so
   938140    817236  12.9% librustc_privacy-4e7c5e5c.rlib
   182653    176563   3.3% librustc_privacy-4e7c5e5c.so
  4390288   3543284  19.3% librustc_resolve-4e7c5e5c.rlib
   872981    831824   4.7% librustc_resolve-4e7c5e5c.so
  1817642  14795426  18.6% librustc_trans-4e7c5e5c.rlib
  3657354   3480026   4.8% librustc_trans-4e7c5e5c.so
 16815076  13868862  17.5% librustc_typeck-4e7c5e5c.rlib
  3274439   3123898   4.6% librustc_typeck-4e7c5e5c.so
 21372308  14890582  30.3% librustdoc-4e7c5e5c.rlib
  4501971   4172202   7.3% librustdoc-4e7c5e5c.so
  8055028   2951044  63.4% libserialize-4e7c5e5c.rlib
   958101    710016  25.9% libserialize-4e7c5e5c.so
 30810208  15160648  50.8% libstd-4e7c5e5c.rlib
  6819003   5967485  12.5% libstd-4e7c5e5c.so
 58850950  31949594  45.7% libsyntax-4e7c5e5c.rlib
  9060154   7882423  13.0% libsyntax-4e7c5e5c.so
  1474310   1062102  28.0% libterm-4e7c5e5c.rlib
   345577    323952   6.3% libterm-4e7c5e5c.so
  2827854   1643056  41.9% libtest-4e7c5e5c.rlib
   517811    452519  12.6% libtest-4e7c5e5c.so
  2274106   1761240  22.6% libunicode-4e7c5e5c.rlib
--------- --------- ------ --------------------------------
499359187 363465583  27.2% total
````

Some notes:

* Uncompressed metadata compacts very well. It is less visible for compressed metadata but still it achieves about 5~10% reduction.
* *Every* commit is designed to reduce the metadata in one way. There is absolutely no negative impact associated to changes (that's why the table above doesn't contain a minus delta).
* I've confirmed that this compiles through `make all`, making it almost correct. Other platforms have to be tested though.
* Oh, I'll rebase this as soon as I have spare time, but I guess this needs an extensive review anyway.
* I haven't rigorously checked the encoder and decoder performance. I tried to minimize the impact (some encodings are actually simpler than the original), but I'm not sure.

Fixes #2743, #9303 (partially) and #21482.
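The delta column appears to be the relative size reduction, `(before - after) / before`. A small sketch to recompute it from the byte counts above; the function name `reduction_percent` is hypothetical.

```rust
/// Relative size reduction in percent, given sizes in bytes.
fn reduction_percent(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}
```

For the totals, `reduction_percent(499359187, 363465583)` rounds to the 27.2% shown in the last row.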
@lifthrasiir You are a hero.