Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate the HTML output from all component examples #2828

Merged
merged 22 commits into from
Sep 13, 2022
Merged

Conversation

36degrees
Copy link
Contributor

This adds a dependency on HTML Validate, an offline HTML validator, which we can use in our tests to validate the HTML output from every component example.

After adding tests for every component example, I disabled every rule that was failing until the tests were passing, and then. taking each rule in turn either:

  • fixed the violations
  • tweaked the rule config or the element definitions so that the rule passed whilst still catching other examples
  • left the rule disabled and added a comment to explain why we should keep it disabled

I'd recommend reviewing commit-by-commit.

This fixes an issue where tests that run after calls to Nunjucks.configure can fail, because calls to `nunjucks.configure` affect any calls to `nunjucks.render` that happen afterwards.

From the Nunjucks documentation [1]:

> Warning: The simple API (above; e.g. `nunjucks.render`) always uses the configuration from the most recent call to nunjucks.configure. Since this is implicit and can result in unexpected side effects, use of the simple API is discouraged in most cases (especially if configure is used); instead, explicitly create an environment using `var env = nunjucks.configure(...)` and then call `env.render(...)` etc.

Follow their suggested best practise by explicitly creating an environment and using the environment to render the templates to HTML.

[1]: https://mozilla.github.io/nunjucks/api#configure
Use the html-validate package and jest helpers to validate all of the examples from each component.
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 5, 2022 13:19 Inactive
When rendering components to HTML we're create an HTML 'fragment' rather than an entire document. We then load the HTML into Cheerio so that we can interrogate the DOM and make assertions about the makeup of the component.

By default Cheerio will turn the fragment into a document [1]:

> Note that similar to web browser contexts, this operation may introduce `<html>`, `<head>`, and `<body>` elements; set `isDocument` to `false` to switch to fragment mode and disable this.

The document that gets created includes some issues which then get flagged by the validator:

> Expected HTML to be valid but had the following errors:
>
>      <html> is missing required "lang" attribute [element-required-attributes]
>      <head> element must have <title> as content [element-required-content]

Set `isDocument` (the third argument) to false when calling load to disable this behaviour so that we're validating only the HTML that comes from the component.

[1]: https://cheerio.js.org/functions/load.html#load
Add missing href options to the 'current' pages in the pagination component to avoid empty href attributes on the links

Override the html-validate config to allow `textarea` elements to have an `autocomplete` attribute of `street-address`, which html-validate doesn't allow for [1] but the spec definitely does [2].

[1]: https://gitlab.com/html-validate/html-validate/-/blob/52034219ca24d2d1f0e90187ec5dd40c7ca6d30f/elements/html5.js#L1675
[2]: https://html.spec.whatwg.org/dev/form-control-infrastructure.html#autofilling-form-controls:-the-autocomplete-attribute
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 5, 2022 13:26 Inactive
const $ = render(component, data)

expect($.html()).toHTMLValidate({
rules: {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config could all go in to a .htmlvalidate.js in the root of the repo, but my thinking is whilst we're only using this in one place we might as well keep it all together?

@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 5, 2022 16:29 Inactive
@@ -37,7 +37,7 @@ treat it as an interactive element - without this it will be

{#- Define common attributes we can use for both button and input types #}

{%- set buttonAttributes %}{% if params.name %} name="{{ params.name }}"{% endif %}{% if params.disabled %} disabled="disabled" aria-disabled="true"{% endif %}{% if params.preventDoubleClick %} data-prevent-double-click="true"{% endif %}{% endset %}
{%- set buttonAttributes %}{% if params.name %} name="{{ params.name }}"{% endif %}{% if params.disabled %} disabled aria-disabled="true"{% endif %}{% if params.preventDoubleClick %} data-prevent-double-click="true"{% endif %}{% endset %}
Copy link
Contributor

@NickColley NickColley Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potentially breaking change

button.disabled will still be true
but button.getAttribute('disabled') will change from "disabled" to an empty string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch Nick, thanks.

I'm going to split this change out into a follow up PR as it's the only change to the HTML output of one of the macros.

We're still discussing as a team how we want to treat the change, but at the very least it should go in the release notes, which will be easier once it's split out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force-pushed to replace the changes in fcd04cd with 500a02f.

This is an opinionated rule that enforces omitting the value for boolean attributes – so `<button disabled>` rather than `<button disabled="disabled">`.

However, according to the HTML standard either is fine.

At the minute we're inconsistent on this. Radios, checkboxes and select all omit the value (`disabled`), but buttons use `disabled="disabled"`.

We want to avoid making changes to the HTML output from a macro (unless it is invalid HTML) for now, but will consider fixing the inconsistency and re-enabling this rule in the future.
Override the html-validate config to allow buttons to omit the `type` attribute (if omitted, the button type defaults to 'submit' [1])

[1]: https://html.spec.whatwg.org/multipage/form-elements.html#attr-button-type
This fails the input's "with pattern attribute" example with:

> Attribute "pattern" is not allowed on <input type="number">

For now we want to support this combination, and there's no way to customise the rule to prevent it from failing.
From the HTML Validate docs [1]:

> By default, this rule enforces that ID begins with a letter and that only letters, numbers, underscores and dashes are used but this can be disabled with the "relaxed" option to only validate the strict definition from the standard.

[1]: https://html-validate.org/rules/valid-id.html
The skip link doesn't have default for text, so when neither text nor html are passed you end up with a link with no text which violates this rule:

> Anchor link must have a text describing its purpose [wcag/h30]
Some of our examples have a fieldset with no legend, which violates this rule:

> <fieldset> must have a <legend> as the first child [wcag/h71]

Add a legend to all examples that use a fieldset and are currently missing a legend.
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 7, 2022 10:57 Inactive
@36degrees 36degrees requested a review from a team September 7, 2022 10:59
The render function renders the Nunjucks to HTML and then loads it in to a Cheerio object, which involves a parsing step.

When we ask Cheerio to give us the HTML back, it gives us a representation of the DOM that it built when it parsed the HTML. Any malformed or mismatched tags, etc that cannot be represented in the DOM will have been handled in the same way that a browser would.

Instead, validate the HTML exactly as it is rendered by the macro, skipping the extra parsing stage, and ensuring that we catch things like malformed tags which would otherwise be hidden from the validator.
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 8, 2022 09:29 Inactive
@36degrees 36degrees added this to the [NEXT] milestone Sep 12, 2022
Copy link
Member

@romaricpascal romaricpascal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the idea to validate the HTML we output 🙌🏻 Just caught a few details in the JSDocs that could be tweaked, but if we don't want to go that nitpicky, feel free to leave them out 😄

lib/jest-helpers.js Outdated Show resolved Hide resolved
lib/jest-helpers.js Outdated Show resolved Hide resolved
lib/jest-helpers.js Outdated Show resolved Hide resolved
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 12, 2022 15:11 Inactive
36degrees and others added 2 commits September 12, 2022 16:13
Co-authored-by: Romaric Pascal <romaric.pascal@digital.cabinet-office.gov.uk>
Fixes an issue flagged by the HTML validator with one of the fixtures for the tabs component:

> Attribute "tabIndex" should be lowercase [attr-case]
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-2828 September 12, 2022 15:13 Inactive
@36degrees 36degrees merged commit faefdf3 into main Sep 13, 2022
@36degrees 36degrees deleted the html-validator branch September 13, 2022 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants