Privacy plugin crashes on HTTP errors #8012

Closed
4 tasks done
Lucas-C opened this issue Feb 18, 2025 · 10 comments
Labels
bug (Issue reports a bug), resolved (Issue is resolved, yet unreleased if open)

Comments

@Lucas-C
Contributor

Lucas-C commented Feb 18, 2025

Context

I am the maintainer of fpdf2 and use your excellent theme for our documentation: https://py-pdf.github.io/fpdf2/

I enabled the privacy plugin yesterday.

Bug description

In our GitHub Actions build pipeline, I see frequent (but not systematic) failures like this:
https://github.com/py-pdf/fpdf2/actions/runs/13388083737/job/37389293893?pr=1366

10:08:50 [mkdocs.material.privacy] Downloading external file: https://api.star-history.com/svg?repos=py-pdf/fpdf2
10:08:51 [mkdocs.commands.build] Error reading page 'index.md': 'content-type'
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.13.2/x64/bin/mkdocs", line 8, in <module>
    sys.exit(cli())
             ~~~^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/mkdocs/__main__.py", line 288, in build_command
    build.build(cfg, dirty=not clean)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/mkdocs/commands/build.py", line 310, in build
    _populate_page(file.page, config, files, dirty)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/mkdocs/commands/build.py", line 171, in _populate_page
    page.content = config.plugins.on_page_content(
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        page.content, page=page, config=config, files=files
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/mkdocs/plugins.py", line 638, in on_page_content
    return self.run_event('page_content', html, page=page, config=config, files=files)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/mkdocs/plugins.py", line 566, in run_event
    result = method(item, **kwargs)
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/material/plugins/privacy/plugin.py", line 148, in on_page_content
    self._queue(url, config, concurrent = True)
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/material/plugins/privacy/plugin.py", line 383, in _queue
    self._fetch(file, config)
    ~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/material/plugins/privacy/plugin.py", line 421, in _fetch
    mime = res.headers["content-type"].split(";")[0]
           ~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.13.2/x64/lib/python3.13/site-packages/requests/structures.py", line 52, in __getitem__
    return self._store[key.lower()][1]
           ~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'content-

On other executions, it sometimes works fine:
https://github.com/py-pdf/fpdf2/actions/runs/13387513920/job/37389309336 (retried; the first execution failed)

10:09:02 [mkdocs.material.privacy] Downloading external file: https://api.star-history.com/svg?repos=py-pdf/fpdf2

All is fine, but we can see that the with-pdf plugin has trouble downloading this image:

ERROR:weasyprint:Failed to load image at 'https://api.star-history.com/svg?repos=py-pdf/fpdf2': TimeoutError: The read operation timed out
ERROR:weasyprint:Failed to load image at 'data:application/pdf;base64,JVBERi0xLjMKMyAwIG9iago8PC9UeXBlIC9QYWdlCi9QYXJlbnQgMSAwIFIKL1Jlc291cmNlcyAyIDAgUgovQ29udGVudHMgNCAwIFI+PgplbmRvYmoKNCAwIG9iago8PC9GaWx0ZXIgL0ZsYXRlRGVjb2RlIC9MZW5ndGggNzM+PgpzdHJlYW0KeJwzUvDiMtAzNVco53IKUdB3M1QwsdAzMFAISVNwDQEJGRvqGVoomJub6hmaKISkKGhkpObk5CuU5xflpGgqhGSBlAEAC64QcgplbmRzdHJlYW0KZW5kb2JqCjEgMCBvYmoKPDwvVHlwZSAvUGFnZXMKL0tpZHMgWzMgMCBSXQovQ291bnQgMQovTWVkaWFCb3ggWzAgMCA1OTUuMjggODQxLjg5XQo+PgplbmRvYmoKNSAwIG9iago8PC9UeXBlIC9Gb250Ci9CYXNlRm9udCAvSGVsdmV0aWNhCi9TdWJ0eXBlIC9UeXBlMQovRW5jb2RpbmcgL1dpbkFuc2lFbmNvZGluZwo+PgplbmRvYmoKMiAwIG9iago8PAovUHJvY1NldCBbL1BERiAvVGV4dCAvSW1hZ2VCIC9JbWFnZUMgL0ltYWdlSV0KL0ZvbnQgPDwKL0YxIDUgMCBSCj4+Ci9YT2JqZWN0IDw8Cj4+Cj4+CmVuZG9iago2IDAgb2JqCjw8Ci9DcmVhdGlvbkRhdGUgKEQ6MjAyMjA5MTUwNjU0NDJaMDYnNTQnKQo+PgplbmRvYmoKNyAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMSAwIFIKL09wZW5BY3Rpb24gWzMgMCBSIC9GaXRIIG51bGxdCi9QYWdlTGF5b3V0IC9PbmVDb2x1bW4KPj4KZW5kb2JqCnhyZWYKMCA4CjAwMDAwMDAwMDAgNjU1MzUgZiAKMDAwMDAwMD

I suspect that in some cases the HTTP retrieval of the SVG fails, but the plugin still tries to parse the headers from the HTTP response, which do not contain any Content-Type.

In that case, I would expect the privacy plugin to skip that <img> and preserve it as it is.
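
A minimal sketch of the guard described above, assuming the plugin has the status code and the response headers at hand. The function name `mime_type_or_none` and the convention of returning `None` to mean "skip this file" are hypothetical and not taken from the plugin's code. A plain mapping with lowercase keys stands in for the case-insensitive headers object that requests provides.

```python
from typing import Mapping, Optional


def mime_type_or_none(status_code: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return the MIME type of a response, or None if the file should be skipped.

    Hypothetical helper: the name and the None convention are assumptions.
    """
    # Treat any 4xx or 5xx status as a failed download
    if status_code >= 400:
        return None
    # Use .get() so that a missing header does not raise KeyError
    content_type = headers.get("content-type")
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8"
    return content_type.split(";")[0].strip()
```

With a helper like this, the caller could leave the original `<img>` URL untouched whenever `None` is returned.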

Related links

Reproduction

9.6.4-privacy-plugin-frequent-keyerror-content-type.zip

Steps to reproduce

Repeatedly call mkdocs build until it fails.

I could not reproduce the failure on my own computer; it fails frequently, but only in GitHub Actions pipelines.

If this disqualifies this bug report, feel free to close it.

Browser

No response

Before submitting

@Lucas-C
Contributor Author

Lucas-C commented Feb 18, 2025

Potential fix: handle the case of the HTTP failure before trying to parse the Content-Type header there:
https://github.com/squidfunk/mkdocs-material/blob/9.6.4/src/plugins/privacy/plugin.py#L407

@squidfunk
Owner

Thanks for reporting! PR appreciated ☺️ I thought that requests raises an exception and terminates, but I seem to be wrong. If you have a quick fix for this, we're happy to include it.

squidfunk added the bug label on Feb 18, 2025
@Lucas-C
Contributor Author

Lucas-C commented Feb 18, 2025

response.raise_for_status() can be called to raise an exception when the HTTP status code is a client or server error (4xx or 5xx):
https://3.python-requests.org/user/quickstart/#response-status-codes

I'd be happy to submit a PR, but would that be OK to raise an exception in that method?
Will it be caught properly by the calling code?

@squidfunk
Owner

Yes, of course. As said, I assumed that was already done, but thinking about it, a 404 is not an error at the HTTP request level. Thus, since requests follows redirects, we can check whether the status code is >= 400 (or whether no body is present?) and raise an error carrying the HTTP status, which should be caught by the thread pool executor.

We need to check what we do with the error: maybe just print it as a warning and continue without replacing? Not sure. This would allow users who do a --strict build to fail the build. Otherwise, I'm happy to discuss other ideas.
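
A rough sketch of the idea in this comment, under stated assumptions: `DownloadError` and `check_status` are hypothetical names, and simulated status codes stand in for real HTTP responses so that the example runs without network access. It shows that an exception raised inside a pool job is stored on the returned future and does not propagate to the caller on its own.

```python
from concurrent.futures import ThreadPoolExecutor


class DownloadError(Exception):
    """Hypothetical error type for a failed download."""


def check_status(url: str, status_code: int) -> None:
    """Raise DownloadError for any 4xx or 5xx status code."""
    if status_code >= 400:
        raise DownloadError(f"HTTP {status_code} for {url}")


# Leaving the with-block waits for both jobs to finish
with ThreadPoolExecutor(max_workers=2) as pool:
    ok = pool.submit(check_status, "https://example.com/a.svg", 200)
    failed = pool.submit(check_status, "https://example.com/b.svg", 504)

# The exception is kept on the future; nothing is raised here
ok_error = ok.exception()
failed_error = failed.exception()
```

Because the error only surfaces when `exception()` or `result()` is called on the future, code that merely waits for the jobs would silently ignore it.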

@squidfunk
Owner

squidfunk commented Feb 18, 2025

Ok, I think the errors are currently not caught, but we could expand the respective wait reconciliations to handle them:

wait(self.pool_jobs)

wait(self.pool_jobs)

wait(self.pool_jobs)

We should add another method that doesn't just wait, but prints the errors. Also, further testing is necessary to check that the paths are not replaced when the requests fail.
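
One possible shape for such a method, as a sketch only: `wait_and_report`, the logger name, and the `fail` job are hypothetical and do not come from the plugin. It waits for all jobs with `concurrent.futures.wait`, logs each stored exception as a warning, and returns the errors so that the caller can decide what to do with them.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor, wait
from typing import List

log = logging.getLogger("privacy_sketch")  # hypothetical logger name


def wait_and_report(jobs: List[Future]) -> List[BaseException]:
    """Wait for all jobs, log each failure as a warning, and return the errors."""
    done, _ = wait(jobs)
    errors = []
    for job in done:
        # Calling exception() on a cancelled future would raise, so skip those
        if job.cancelled():
            continue
        error = job.exception()
        if error is not None:
            log.warning("Download failed: %s", error)
            errors.append(error)
    return errors


def fail() -> None:
    """Stand-in for a download job that hits an HTTP error."""
    raise RuntimeError("HTTP 504")


with ThreadPoolExecutor(max_workers=2) as pool:
    jobs = [pool.submit(int, "1"), pool.submit(fail)]
    errors = wait_and_report(jobs)
```

Logging at warning level instead of re-raising keeps a normal build running while still leaving a visible trace of each failed download.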

squidfunk changed the title from "[Privacy plugin] Frequent KeyError: 'content-type' not being properly handled" to "Privacy plugin crashes on HTTP errors" on Feb 18, 2025
Lucas-C pushed a commit to Lucas-C/mkdocs-material that referenced this issue Feb 18, 2025
@Lucas-C
Contributor Author

Lucas-C commented Feb 18, 2025

I opened PR #8015

It seems that you do not have any unit tests, so I did not add any.

I'd be happy to have your review on this PR 🙂

> We should add another method that doesn't just wait, but prints the errors. Also, further testing is necessary to check that the paths are not replaced when the requests fail.

I tested that and everything seems fine.

Lucas-C pushed a commit to Lucas-C/mkdocs-material that referenced this issue Feb 18, 2025
@squidfunk
Owner

Thanks! I'll look into it.

> It seems that you do not have any unit tests, so I did not add any.

Yup, which historically comes from this project starting out as a theme only 😅 We're in the process of fixing this as part of the foundational work we're currently focusing on, and will add proper unit and integration tests.

Lucas-C pushed a commit to Lucas-C/mkdocs-material that referenced this issue Feb 19, 2025
squidfunk added the resolved label on Feb 20, 2025
@squidfunk
Owner

Fixed in 2e837fa by @Lucas-C via #8012.

@squidfunk
Owner

Released as part of 9.6.5.

@Lucas-C
Contributor Author

Lucas-C commented Feb 20, 2025

Thank you!
