Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept additional user-defined classes in fenced code blocks #79454

Closed

Conversation

poliorcetics
Copy link
Contributor

@poliorcetics poliorcetics commented Nov 26, 2020

Fix #78917.

This allow users to write the following:

/// Some code block with `rust,class:test:name` as the language string
///
/// ```rust,class:test:name
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}

This will notably produce the following HTML:

<pre class="rust test:name">
int main(void) {
    return 0;
}</pre>

Instead of:

<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>

TODO:

MERGE:

@rust-highfive
Copy link
Collaborator

r? @jyn514

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 26, 2020
@poliorcetics

This comment has been minimized.

@rustbot rustbot added A-markdown-parsing Area: Markdown parsing for doc-comments T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Nov 26, 2020
@jyn514
Copy link
Member

jyn514 commented Nov 27, 2020

I'm not sure I understand what this is actually getting us. There's no more rust syntax highlighting, yes - but there's no C source highlighting either, which was kind of the point. And the C highlighting can't use the same mechanism, because rust highlighting uses rustc_lexer directly and I do not want to tie rustdoc into clang, nor have to do something similar for each language. That means highlight.js or something like it, but that seems somewhat complicated ...

Anyway, I think we should have this feature actually do something before we accept the syntax, especially since the new feature is insta-stable currently.

@jyn514
Copy link
Member

jyn514 commented Nov 27, 2020

cc @GuillaumeGomez - do you have opinons on highlight.js?

@Hywan
Copy link
Contributor

Hywan commented Nov 27, 2020

I think it's better to not force a specific Javascript library to highlight the code. One may want highlight.js while another one may prefer prism.js. The primary goal of this PR is to allow passing any HTML class to the code block and to disable the Rust highlighting in this specific case.

@GuillaumeGomez
Copy link
Member

I'm not sure I understand what this is actually getting us. There's no more rust syntax highlighting, yes - but there's no C source highlighting either, which was kind of the point. And the C highlighting can't use the same mechanism, because rust highlighting uses rustc_lexer directly and I do not want to tie rustdoc into clang, nor have to do something similar for each language. That means highlight.js or something like it, but that seems somewhat complicated ...

That was never the goal. The point here is to allow users to add a class themselves and disable the rustdoc highlighting so that they can then add a JS library which will do it (like highlight.js or prism.js as mentionned by @Hywan).

Anyway, I think we should have this feature actually do something before we accept the syntax, especially since the new feature is insta-stable currently.

We can make it remain unstable by adding a check when parsing the codeblocks' tags for the moment. I think it would be preferrable in fact. Users will still be able to use it on docs.rs if they want to and that will allow to make breaking changes if needed.

@poliorcetics Can you add the check for nightly and only allow this feature then? (And otherwise, maybe print a warning saying it's nightly only for the moment?)

@jyn514
Copy link
Member

jyn514 commented Nov 27, 2020

That was never the goal. The point here is to allow users to add a class themselves and disable the rustdoc highlighting so that they can then add a JS library which will do it.

Ahhh, that makes way more sense. Ok, I'm on board with this then.

@poliorcetics Can you add the check for nightly and only allow this feature then? (And otherwise, maybe print a warning saying it's nightly only for the moment?)

This should go through the normal feature gate mechanism, which gives an error when you try to use nightly features on stable.

Another thing is that I think this will change how language strings are parsed, so it should probably wait for #78429.

@poliorcetics
Copy link
Contributor Author

This should go through the normal feature gate mechanism, which gives an error when you try to use nightly features on stable.

I don't know how those work for rustdoc, is there some place can I find the information or an example ?

@GuillaumeGomez
Copy link
Member

@poliorcetics It's the same as for the rest of the compiler. You can see everything that you need in this PR for example (which is removing the feature gate part since it's stabilizing but you got the idea).

@jyn514 jyn514 added S-blocked Status: Blocked on something else such as an RFC or other implementation work. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 27, 2020
@poliorcetics
Copy link
Contributor Author

With those two commits there are now tests for this ! Still a lot of things to do though

@camelid
Copy link
Member

camelid commented Nov 28, 2020

I saw on Zulip that you were going to call this feature user_classes_in_docs. Might custom_code_classes (maybe add docs somewhere, not sure if it's necessary) be a clearer name?

@poliorcetics
Copy link
Contributor Author

It can easily be renamed as long as this isn't merged. I used a name because I needed one to proceed but I'm not particularly attached to it

@camelid
Copy link
Member

camelid commented Nov 28, 2020

tidy error: Found 1 features without a gate test.
Expected a gate test for the feature 'custom_code_classes_in_docs'.
Hint: create a failing test file named 'feature-gate-custom_code_classes_in_docs.rs'
      in the 'ui' test suite, with its failures due to
      missing usage of `#![feature(custom_code_classes_in_docs)]`.
Hint: If you already have such a test and don't want to rename it,
      you can also add a // gate-test-custom_code_classes_in_docs line to the test file.

@GuillaumeGomez
Copy link
Member

@poliorcetics Look at the PR I linked above. Everything that I removed in it is was you need to add to have a full feature.

src/librustdoc/passes/check_custom_code_classes.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/check_custom_code_classes.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/check_custom_code_classes.rs Outdated Show resolved Hide resolved
src/tools/tidy/src/features.rs Outdated Show resolved Hide resolved
src/tools/tidy/src/features.rs Outdated Show resolved Hide resolved
@poliorcetics
Copy link
Contributor Author

Barring errors in CI, this should now only need clear up work !

@poliorcetics poliorcetics force-pushed the code-blocks-user-classes branch from 369473d to ec87c49 Compare December 4, 2020 00:00
@poliorcetics
Copy link
Contributor Author

Only the tests to clean up now (#79454 (comment))

@bors

This comment has been minimized.

@camelid
Copy link
Member

camelid commented Feb 26, 2021

No longer blocked! #78429 has been merged. 🎉

@rustbot label: -S-blocked

@rustbot rustbot removed the S-blocked Status: Blocked on something else such as an RFC or other implementation work. label Feb 26, 2021
@poliorcetics poliorcetics force-pushed the code-blocks-user-classes branch from 999605d to 4b83841 Compare March 6, 2021 16:38
@rust-log-analyzer

This comment has been minimized.

@poliorcetics poliorcetics force-pushed the code-blocks-user-classes branch from 4b83841 to 027521f Compare March 6, 2021 16:54
@poliorcetics
Copy link
Contributor Author

poliorcetics commented Mar 6, 2021

@rustbot label: +S-waiting-on-review -S-waiting-on-author

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 6, 2021
@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 6, 2021
@bors

This comment has been minimized.

This allow users to write the following:

```
/// Some code block with `rust,class:test:name` as the language string:
///
/// ```rust,class:test:name
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This block will be rendered without any highlighting and will have a class
`test:name` in the resulting CSS.
@poliorcetics poliorcetics force-pushed the code-blocks-user-classes branch from 027521f to 3326af2 Compare March 13, 2021 17:46
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 13, 2021
Copy link
Member

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still waiting on #79454 (comment) and #79454 (comment) (although I don't care very much about the second one).

Comment on lines +886 to +887
if class.is_empty() {
seen_other_tags = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, what does this do? I would expect class: to give an error (or at least a warning).

@camelid camelid added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 26, 2021
@jyn514 jyn514 removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 4, 2021
@crlf0710
Copy link
Member

@poliorcetics Ping from triage! Could you address the comments above? Thanks!

@jyn514
Copy link
Member

jyn514 commented Apr 26, 2021

Closing due to inactivity.

@jyn514 jyn514 closed this Apr 26, 2021
@jyn514 jyn514 added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 26, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 15, 2023
…n_docs, r=t-rustdoc

Accept additional user-defined syntax classes in fenced code blocks

Part of rust-lang#79483.

This is a re-opening of rust-lang#79454 after a big update/cleanup. I also converted the syntax to pandoc as suggested by `@notriddle:` the idea is to be as compatible as possible with the existing instead of having our own syntax.

## Motivation

From the original issue: rust-lang#78917

> The technique used by `inline-c-rs` can be ported to other languages. It's just super fun to see C code inside Rust documentation that is also tested by `cargo doc`. I'm sure this technique can be used by other languages in the future.

Having custom CSS classes for syntax highlighting will allow tools like `highlight.js` to be used in order to provide highlighting for languages other than Rust while not increasing technical burden on rustdoc.

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang#110800 (comment))).

## Raised concerns

#### It's not obvious when the `language-*` attribute generation will be added or not.

It is added by default. If you want to disable it, you will need to use the `custom` attribute.

#### Why not using HTML in markdown directly then?

Code examples in most languages are likely to contain `<`, `>`, `&` and `"` characters. These characters [require escaping](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/pre) when written inside the `<pre>` element. Using the \`\`\` code blocks allows rustdoc to take care of escaping, which means doc authors can paste code samples directly without manually converting them to HTML.

cc `@poliorcetics`
r? `@notriddle`
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 16, 2023
…n_docs, r=t-rustdoc

Accept additional user-defined syntax classes in fenced code blocks

Part of rust-lang#79483.

This is a re-opening of rust-lang#79454 after a big update/cleanup. I also converted the syntax to pandoc as suggested by `@notriddle:` the idea is to be as compatible as possible with the existing instead of having our own syntax.

## Motivation

From the original issue: rust-lang#78917

> The technique used by `inline-c-rs` can be ported to other languages. It's just super fun to see C code inside Rust documentation that is also tested by `cargo doc`. I'm sure this technique can be used by other languages in the future.

Having custom CSS classes for syntax highlighting will allow tools like `highlight.js` to be used in order to provide highlighting for languages other than Rust while not increasing technical burden on rustdoc.

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang#110800 (comment))).

## Raised concerns

#### It's not obvious when the `language-*` attribute generation will be added or not.

It is added by default. If you want to disable it, you will need to use the `custom` attribute.

#### Why not using HTML in markdown directly then?

Code examples in most languages are likely to contain `<`, `>`, `&` and `"` characters. These characters [require escaping](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/pre) when written inside the `<pre>` element. Using the \`\`\` code blocks allows rustdoc to take care of escaping, which means doc authors can paste code samples directly without manually converting them to HTML.

cc `@poliorcetics`
r? `@notriddle`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-markdown-parsing Area: Markdown parsing for doc-comments S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alternative highlighting for Rust code block in rustdoc