Disable cache for EXT:crawler indexing requests #428

Closed
kraemer-igroup opened this issue Jan 24, 2025 · 2 comments · Fixed by #429

@kraemer-igroup
Contributor

Hi @lochmueller, like Solr, indexing for indexed_search via EXT:crawler is not working properly in combination with the staticfilecache extension (see tomasnorre/crawler#1029).

For Solr this was fixed within these issues:
#200
#279

Could we also disable the cache for EXT:crawler indexing requests for indexed_search? I could provide a pull request.
EXT:crawler sends "X-T3Crawler" as a request header:
https://github.com/tomasnorre/crawler/blob/main/Classes/CrawlStrategy/GuzzleExecutionStrategy.php#L99

@lochmueller
Owner

Hey @kraemer-igroup

thanks for the note. Could you describe the problem in more detail? You mean the reindex is not working because EXT:crawler gets the SFC cache entry when it sends a real HTTP request? So you want to add the crawler header to the .htaccess configuration?

Feel free to send a PR.

Regards,
Tim

@kraemer-igroup
Contributor Author

Hey @lochmueller,

yes, sorry, the crawler request seems to get the SFC entry, which does not invoke the middlewares needed to generate the index entry. I tried to work around it by adding this to my .htaccess file:

# Disable cache for EXT:crawler indexing requests
RewriteCond %{HTTP:X-T3Crawler} .+
RewriteRule .* - [E=SFC_HOST:invalid-host]

But this only works if the useFallbackMiddleware option is disabled. In that case the pages/news get indexed again despite having an SFC entry. If I also do something like this with "X-T3Crawler" instead of "X-Tx-Solr-Iq", then the indexing works again, even with the useFallbackMiddleware option enabled: 756048b
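
To make the two cases above easier to compare, here is a rough Python model of the behaviour as I understand it. It is only a sketch: the function names and parameters are hypothetical and are not part of staticfilecache or EXT:crawler. It also assumes that the .htaccess rule only affects delivery by the web server, and that the fallback middleware serves the cache entry unless it checks the header itself.

```python
# Illustrative model only: every name here is hypothetical and none of this
# is code from staticfilecache or EXT:crawler.

# Header names mentioned in this issue, lower-cased for case-insensitive lookup.
BYPASS_HEADERS = ("x-t3crawler", "x-tx-solr-iq")


def has_bypass_header(headers):
    """Return True if a bypass header is present with a non-empty value."""
    normalized = {name.lower(): value for name, value in headers.items()}
    return any(normalized.get(name) for name in BYPASS_HEADERS)


def served_from_static_cache(headers, htaccess_rule, fallback_enabled,
                             fallback_checks_header):
    """Model whether a request for a cached page gets the static cache entry."""
    bypass = has_bypass_header(headers)

    # Layer 1 (assumed): the web server delivers the static file unless the
    # rewrite rule exists and the request carries a bypass header.
    if not (htaccess_rule and bypass):
        return True

    # Layer 2 (assumed): the request reached TYPO3. With the fallback
    # middleware enabled, the cache entry is still served unless the
    # middleware also honours the header.
    if fallback_enabled and not fallback_checks_header:
        return True

    # Otherwise the page is rendered normally and can be indexed.
    return False
```

With a crawler request, the rule alone is enough while the fallback middleware is off, but once it is on the header has to be checked there as well, which matches what I see in my tests.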
