Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc-json: change item ID's repr from a string to an int #130078

Merged
merged 1 commit into from
Oct 12, 2024

Conversation

its-the-shrimp
Copy link
Contributor

@its-the-shrimp its-the-shrimp commented Sep 7, 2024

Following this discussion on Zulip, I've changed the repr of rustdoc_json_types::Id from a String to a u32, by adding a clean::ItemId interner to JsonRenderer

r? @aDotInTheVoid

@rustbot rustbot added A-rustdoc-json Area: Rustdoc JSON backend S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Sep 7, 2024
@rustbot
Copy link
Collaborator

rustbot commented Sep 7, 2024

rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing src/librustdoc/json/conversions.rs; otherwise, make sure you bump the FORMAT_VERSION constant.

cc @CraftSpider, @aDotInTheVoid, @Enselic, @obi1kenobi

Some changes occurred in src/librustdoc/clean/types.rs

cc @camelid

Copy link
Member

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm broadly speaking quite unfamiliar with the librustdoc portion of the codebase, so apologies for my ignorant questions here.

src/librustdoc/clean/types.rs Outdated Show resolved Hide resolved
// FIXME(aDotInTheVoid): Consider making this non-public in rustdoc-types.
pub struct Id(pub String);
pub struct Id(pub u32);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to recover the string ID from the numeric ID given a rustdoc JSON file? Or are you confident that isn't a capability that would be useful in any use case, so it isn't necessary to add?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean "recover the string ID"? If you want to find what the Id would have been under the old schem, then no, this won't be possible, but I'm fine with that.

Rustdoc makes no guarantees about the inner value of Id’s. Applications should treat them as opaque keys to lookup items, and avoid attempting to parse them, or otherwise depend on any implementation details.

@rust-log-analyzer

This comment has been minimized.

Copy link
Member

@aDotInTheVoid aDotInTheVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR's doing alot, and needs major rework, so I've only tried to review it at a high level, as I don't think the in-the-weeds stuff is worth it yet.

  1. The changes to ItemId are intrusive to most of the rustdoc codebase, and not motivated at all. If these are nessessary, could that be explained, and send in a different, tighter-scoped, PR (that would be landed before this)
  2. Making Cache be the only source of state for the clean->json lowering is quite a bad idea:
    • It requires adding a TyCtxt, which produces more changes to the rest of rustdoc.
    • It adds the interner field which isn't used for HTML, but is still present, which is unfortunate.
    • In present, it requires adding a Lock, but I don't think this is inherent.
  3. There seem to be unrelated cleanups (like the removal of FormatRenderer::cache). These are should be merged into rustdoc, but not here! Again, the answer is a smaller, seperate PR, that just does one thing, and will be much easier to review.

src/librustdoc/formats/cache.rs Outdated Show resolved Hide resolved
src/librustdoc/formats/cache.rs Outdated Show resolved Hide resolved
src/librustdoc/json/conversions.rs Outdated Show resolved Hide resolved
src/librustdoc/clean/types.rs Outdated Show resolved Hide resolved
@aDotInTheVoid aDotInTheVoid added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 8, 2024
@its-the-shrimp its-the-shrimp force-pushed the rustdoc-types-compress-ids branch from 60b7fab to f32c35a Compare September 9, 2024 08:24
@rust-log-analyzer

This comment has been minimized.

@jalil-salame
Copy link
Contributor

Some tests are failing to compile due to them expecting the id to be Id(String): /~https://github.com/rust-lang/rust/actions/runs/10769404806/job/29860558315?pr=130078#step:25:6101

@bors
Copy link
Contributor

bors commented Sep 9, 2024

☔ The latest upstream changes (presumably #130133) made this pull request unmergeable. Please resolve the merge conflicts.

@its-the-shrimp its-the-shrimp force-pushed the rustdoc-types-compress-ids branch 3 times, most recently from 10aae07 to c8bf3ef Compare September 10, 2024 11:31
@its-the-shrimp
Copy link
Contributor Author

@rustbot ready

This time the interner is restricted to the JsonRenderer, however I still couldn't get rid of the lock on the interner because of borrow checking errors

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 10, 2024
@its-the-shrimp
Copy link
Contributor Author

its-the-shrimp commented Sep 10, 2024

It could be possible to get rid of it if we were to change the way formats::run_format drives the renderer forward, currently it clones the renderer at least once (and for JSON it'll do so only once), which means the ID interner has to be in an Rc, which is unfortunate since that clone is completely redundant (I'll probably do that after this commit is merged)

@its-the-shrimp
Copy link
Contributor Author

Or I can do it in this PR, the change should be very minimal

@its-the-shrimp
Copy link
Contributor Author

should the FORMAT_VERSION be bumped for this?

@jalil-salame
Copy link
Contributor

should the FORMAT_VERSION be bumped for this?

Most likely yes, it is an API change (rustdoc-types will be unable to parse files with Id(u32)).

@its-the-shrimp
Copy link
Contributor Author

should the FORMAT_VERSION be bumped for this?

Most likely yes, it is an API change (rustdoc-types will be unable to parse files with Id(u32)).

The representation of IDs doesn't have any guarantees associated with it though, and rustdoc-types can get a new major version without incrementing FORMAT_VERSION

@obi1kenobi
Copy link
Member

obi1kenobi commented Sep 11, 2024 via email

@its-the-shrimp its-the-shrimp force-pushed the rustdoc-types-compress-ids branch from c8bf3ef to 507bd42 Compare September 11, 2024 21:33
@aDotInTheVoid
Copy link
Member

Yeah, bumping the format version needs to happen whenever we do something that breaks deserialisation (and I don’t know any times it’s been done for other reasons, but would be interested).

The representation of IDs doesn't have any guarantees associated with it though

Previously we guaranteed they were strings.

@bors
Copy link
Contributor

bors commented Sep 23, 2024

☔ The latest upstream changes (presumably #130724) made this pull request unmergeable. Please resolve the merge conflicts.

@its-the-shrimp its-the-shrimp force-pushed the rustdoc-types-compress-ids branch from 507bd42 to 1e30b5a Compare September 29, 2024 21:10
struct FullItemId {
def_id: DefId,
name: Option<Symbol>,
/// Used to distinguish imports of different items with the same name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: This comment could be more useful. The problem it's actually trying to solve is that a single pub use with 1 defid can expand to multiple rustdoc-json items it the name exists in multiple namespaces: https://godbolt.org/z/jEnGxoWx6.

At some point in the future, we should maybe change how we do this on the JSON side, but not in this PR.

@aDotInTheVoid
Copy link
Member

@bors r+

Thanks for working on this, and sorry for the delay.

@bors
Copy link
Contributor

bors commented Oct 11, 2024

📌 Commit 1e30b5a has been approved by aDotInTheVoid

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 11, 2024
tgross35 added a commit to tgross35/rust that referenced this pull request Oct 11, 2024
…ess-ids, r=aDotInTheVoid

rustdoc-json: change item ID's repr from a string to an int

Following [this discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/266220-t-rustdoc/topic/Optimizing.20the.20.60Id.60.20type.20in.20.60rustdoc-types.60), I've changed the repr of `rustdoc_json_types::Id` from a String to a u32, by adding a `clean::ItemId` interner to `JsonRenderer`

r? `@aDotInTheVoid`
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 11, 2024
Rollup of 7 pull requests

Successful merges:

 - rust-lang#130078 (rustdoc-json: change item ID's repr from a string to an int)
 - rust-lang#131065 (Port sort-research-rs test suite to Rust stdlib tests)
 - rust-lang#131109 (Stabilize `debug_more_non_exhaustive`)
 - rust-lang#131287 (stabilize const_result)
 - rust-lang#131463 (Stabilise `const_char_encode_utf8`.)
 - rust-lang#131543 (coverage: Remove code related to LLVM 17)
 - rust-lang#131552 (RustWrapper: adapt for rename of Intrinsic::getDeclaration)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 05c0591 into rust-lang:master Oct 12, 2024
6 checks passed
@rustbot rustbot added this to the 1.83.0 milestone Oct 12, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Oct 12, 2024
Rollup merge of rust-lang#130078 - its-the-shrimp:rustdoc-types-compress-ids, r=aDotInTheVoid

rustdoc-json: change item ID's repr from a string to an int

Following [this discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/266220-t-rustdoc/topic/Optimizing.20the.20.60Id.60.20type.20in.20.60rustdoc-types.60), I've changed the repr of `rustdoc_json_types::Id` from a String to a u32, by adding a `clean::ItemId` interner to `JsonRenderer`

r? ``@aDotInTheVoid``
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rustdoc-json Area: Rustdoc JSON backend S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants