Fix XSS vulnerability on search results page #323

Merged (2 commits) on Apr 11, 2023

ChrisBAshton commented on Apr 11, 2023

Pages that are indexed in search results have their entire contents indexed, including any HTML code snippets. These HTML snippets would appear in the search results unsanitised, so it was possible to render arbitrary HTML or run arbitrary scripts:

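![script being invoked](https://user-images.githubusercontent.com/5111927/230888935-0367b598-eda7-4f67-afb5-799b41684ee3.png)
![HTML being rendered](https://user-images.githubusercontent.com/5111927/230888939-f0056edc-6955-4f10-8aee-c93414b1cb69.png)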

This is a largely theoretical security issue. To exploit it, an attacker would need to find a way of committing malicious code to a page indexed by a site that uses tech-docs-gem, and such pages are typically not editable by untrusted users. The payload would also be limited by the relatively short excerpt that's rendered in the corresponding search result. Nevertheless, once such content is indexed, the XSS can be triggered by visiting a pre-constructed URL (/search/index.html?q=some+search+term), which users could be tricked into clicking on through social engineering.
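
To make the pre-constructed URL concrete, here is a minimal sketch of how a search term travels in the `q` parameter. The host `docs.example.com` is a placeholder, and reading the term with the standard `URL` API is an assumption for illustration; the gem's own search code may read it differently.

```js
// Placeholder host; only the path and query string come from the description above.
var url = new URL('/search/index.html?q=some+search+term', 'https://docs.example.com');

// URLSearchParams decodes "+" as a space.
var term = url.searchParams.get('q');

console.log(term); // prints: some search term
```

The link only chooses which search runs. The payload itself has to already be in the indexed page content that the matching result displays.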

What’s changed

This commit sanitises the HTML before rendering it to the page. It does so whilst retaining the <mark data-markjs="true"> behaviour that highlights the search term in the result:

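![sanitised HTML with highlights](https://user-images.githubusercontent.com/5111927/230888944-9aaf4920-cddd-43f9-8ef5-17f15785af73.png)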
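
One way to picture sanitising while keeping the highlight is to escape every piece of the indexed text and only then add the `<mark>` wrapper around the pieces that match the search term. The sketch below is plain JavaScript with hypothetical helper names (`escapeHtml`, `escapeRegExp`, `highlight`). It is not the code in this commit, which uses jQuery's text(); it only illustrates the ordering.

```js
// Hypothetical helper: escape characters that are significant in HTML text.
function escapeHtml(text) {
  var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, function (c) { return map[c]; });
}

// Hypothetical helper: escape regex metacharacters in the search term.
function escapeRegExp(term) {
  return String(term).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Split the raw content on the term, escape every piece, and wrap only
// the matched pieces (odd indices, thanks to the capturing group).
function highlight(content, term) {
  if (!term) return escapeHtml(content);
  var pattern = new RegExp('(' + escapeRegExp(term) + ')', 'gi');
  return String(content).split(pattern).map(function (piece, i) {
    var safe = escapeHtml(piece);
    return i % 2 === 1 ? '<mark data-markjs="true">' + safe + '</mark>' : safe;
  }).join('');
}

var result = highlight('<img src=x onerror=alert(1)> search tips', 'search');
console.log(result);
// prints: &lt;img src=x onerror=alert(1)&gt; <mark data-markjs="true">search</mark> tips
```

Because the only unescaped markup in the output is the `<mark>` wrapper added by the helper, the indexed content cannot introduce tags of its own.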

I've used jQuery's text() function for sanitisation, as that is the approach used elsewhere in the project ([1]).

I did consider using native JavaScript (the same approach as in Mustache [2]) to avoid the jQuery dependency, but a hand-rolled escaper may itself contain bugs and would leave two sanitisation approaches to maintain, so I opted against it. For future reference, the code in this commit can be swapped out with:

```js
var entityMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
  '/': '&#x2F;',
  '`': '&#x60;',
  '=': '&#x3D;'
};
var sanitizedContent = String(content).replace(/[&<>"'`=\/]/g, function (s) {
  return entityMap[s];
});
```
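
As a quick illustration of how that snippet behaves, the sketch below wraps it in a helper and runs it against a script payload. The function name `escapeHtml` and the sample input are hypothetical and not part of this commit.

```js
// Same entity map as in the snippet above.
var entityMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
  '/': '&#x2F;',
  '`': '&#x60;',
  '=': '&#x3D;'
};

// Hypothetical wrapper so the replacement can be reused and tested.
function escapeHtml(content) {
  return String(content).replace(/[&<>"'`=\/]/g, function (s) {
    return entityMap[s];
  });
}

var escaped = escapeHtml('<script>alert(1)</script>');
console.log(escaped);
// prints: &lt;script&gt;alert(1)&lt;&#x2F;script&gt;
```

The escaped string is safe to insert as HTML because the browser renders the entities as literal characters rather than parsing them as tags.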

Identifying a user need

The look and interactions of the gem are unchanged. This simply addresses a security issue.

ChrisBAshton marked this pull request as ready for review on April 11, 2023 07:44
ChrisBAshton requested a review from lfdebrux on April 11, 2023 07:56

[1]: https://github.com/alphagov/tech-docs-gem/blob/66cc7ab0a06dc2f1fe89de8cba2270fcf46f6466/lib/assets/javascripts/_modules/search.js#L202-L204
[2]: https://github.com/janl/mustache.js/blob/972fd2b27a036888acfcb60d6119317744fac7ee/mustache.js#L60-L75
lfdebrux merged commit a51c705 into main on Apr 11, 2023
lfdebrux deleted the fix-xss branch on April 11, 2023 09:24