This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Stop synapse from saving messages in device_inbox for hidden devices. #10097
I'm afraid this query is quite slow on large deployments (testing it out in a dry run on matrix.org, I had to cancel after 5 minutes). Database migrations such as this block homeserver startup until they're complete, and thus we tend to avoid having long-running ones.
As such, I think this would be better suited as a background update, which runs in the background after the server has already started up.
Here's an example of turning an existing database migration into a background update: https://github.com/matrix-org/synapse/pull/9536/files
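To make the batching idea concrete, here is a minimal sketch using only Python's built-in `sqlite3` module. It is not Synapse's background update API and not the code from the linked PR. The simplified schema, the `hidden` flag on `devices`, and the names `cleanup_batch` and `run_cleanup` are assumptions made for illustration, as is the premise that `stream_id` is a positive, increasing integer that can serve as a resume position.

```python
import sqlite3
from typing import Optional

# Hypothetical, simplified schema; the real Synapse tables have more columns.
SCHEMA = """
CREATE TABLE devices (user_id TEXT, device_id TEXT, hidden INTEGER);
CREATE TABLE device_inbox (
    user_id TEXT, device_id TEXT, stream_id INTEGER, message_json TEXT
);
"""


def cleanup_batch(
    conn: sqlite3.Connection, last_stream_id: int, batch_size: int
) -> Optional[int]:
    """Clean one slice of device_inbox.

    Returns the stream_id to resume from, or None once the table is exhausted.
    """
    # Pick the next `batch_size` rows after the saved position.
    rows = conn.execute(
        "SELECT stream_id FROM device_inbox WHERE stream_id > ? "
        "ORDER BY stream_id LIMIT ?",
        (last_stream_id, batch_size),
    ).fetchall()
    if not rows:
        return None
    upper = rows[-1][0]

    # Delete only the rows in this slice that belong to hidden devices.
    conn.execute(
        """
        DELETE FROM device_inbox
        WHERE stream_id > ? AND stream_id <= ?
          AND EXISTS (
              SELECT 1 FROM devices AS d
              WHERE d.user_id = device_inbox.user_id
                AND d.device_id = device_inbox.device_id
                AND d.hidden = 1
          )
        """,
        (last_stream_id, upper),
    )
    # Commit per batch so each transaction stays short.
    conn.commit()
    return upper


def run_cleanup(conn: sqlite3.Connection, batch_size: int = 100) -> int:
    """Run batches until done; returns the number of batches processed."""
    position, batches = 0, 0
    while True:
        next_position = cleanup_batch(conn, position, batch_size)
        if next_position is None:
            return batches
        position = next_position
        batches += 1
```

Because each batch commits on its own and the only state carried between batches is the last `stream_id`, the work can be interrupted and resumed, which is the property that keeps it from blocking startup. A real background update would persist that position rather than hold it in a local variable.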
Yes, it will run for a long time. Running it manually here on 110,000,000 rows took a few hours. For small homeservers it would be fine. Do I understand correctly that it will iterate over every row in the table? If so, it would be useful to add a third PR that removes the SQL delta files here and in #10098 and does something like this as a background update instead.
And that gives us the opportunity to test it now, with the slow version.
Good point, it's probably best to encapsulate the background job in a separate PR as it's not required for the main functionality of this PR anyhow.
The currently active indexes on this table are:
Because of the lack of stream_id, I believe we'd need to iterate over the table.
Sure, people can run the SQL manually during operation if they wish, and then we just add a background update separately to clean it out in parallel for larger servers 🙂
A little off topic for here, but I noticed in our database while trying to do a similar cleanup that even selecting the rows to be removed was a really slow operation. Something like
SELECT * FROM device_inbox WHERE device_id = 'foo'
took several minutes to run. We might have to be a little clever in how we clean up this table.
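One possible explanation, offered as an assumption rather than a diagnosis, is that a filter on device_id alone cannot seek into a composite index whose leading column is something else. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the production database. The index `hypothetical_user_device_idx` and the helper `query_plan` are invented for illustration; the thread does not show the table's actual indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE device_inbox (
        user_id TEXT, device_id TEXT, stream_id INTEGER, message_json TEXT
    );
    -- Hypothetical index: device_id is only the second column.
    CREATE INDEX hypothetical_user_device_idx
        ON device_inbox (user_id, device_id, stream_id);
    """
)


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan description for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


# Filtering on device_id alone: the planner falls back to scanning the table.
by_device = query_plan(
    conn, "SELECT * FROM device_inbox WHERE device_id = ?", ("foo",)
)

# Supplying the leading column as well lets the planner seek into the index.
by_user_and_device = query_plan(
    conn,
    "SELECT * FROM device_inbox WHERE user_id = ? AND device_id = ?",
    ("@a:hs", "foo"),
)

print(by_device)
print(by_user_and_device)
```

If the real table looks similar, a cleanup that already knows the (user_id, device_id) pairs to remove could target them directly instead of filtering on device_id alone. Other planners, PostgreSQL included, may behave differently, so the plan should be checked on the actual database.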