-
Notifications
You must be signed in to change notification settings - Fork 234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a query param participant
to admin endpoint /members
to filter for participants in a room
#18040
base: develop
Are you sure you want to change the base?
Conversation
Thinking out loud, it might make more sense to make this an additional query parameter ( What do you think @H-Shay? |
Oh interesting - just to flesh it out, in your formulation if the query parameter |
@H-Shay that's correct, yep! Would you be able to rework the PR? Apologies for the busy work! |
(Taking this off the review queue in the meantime.) |
participant
to admin endpoint /members
to fetch the participants in a room
participant
to admin endpoint /members
to fetch the participants in a roomparticipant
to admin endpoint /members
to filter for participants in a room
@anoadragon453 I have re-worked the PR but seem to be unable to re-request a review - can you take a look when you get a chance or pop this back into the queue? Thanks so much! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall direction looks good, but we'll need to rework the query due to performance concerns I'm afraid.
|
||
The following query parameters are available: | ||
|
||
* `participant` - Optional. If provided and set to true, only returns currently joined members who have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* `participant` - Optional. If provided and set to true, only returns currently joined members who have | |
* `participant` (bool) - Optional. If provided and set to true, only returns currently joined members who have |
SELECT DISTINCT c.state_key | ||
FROM current_state_events AS c | ||
INNER JOIN events AS e USING(room_id) | ||
WHERE room_id = ? | ||
AND c.membership = 'join' | ||
AND e.type = 'm.room.message' | ||
AND c.state_key = e.sender |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this query is currently quite slow. Testing this on matrix.org with Matrix HQ as the room, this query took about 50 seconds to run. It would be much slower on typical homeserver hardware. Postgres' EXPLAIN
command measures the cost at cost=438880.86..440110.15 rows=122929 width=29
.
Rather than iterating through current_state_events
for room membership, the room_memberships
table could be used to check which users are currently joined to a given room. We can rewrite the query as follows:
SELECT DISTINCT rm.user_id
FROM room_memberships AS rm
INNER JOIN events AS e USING(room_id)
WHERE room_id = ?
AND rm.membership = 'join'
AND e.type = 'm.room.message'
AND rm.user_id = e.sender;
This cuts our cost down a little bit, but still not much; psql
rates the cost as cost=438882.94..440112.23 rows=122929 width=29
, and still taking about 30 seconds.
The majority of the cost comes from scanning events
for m.room.message
-type events from the given room:
-> Parallel Bitmap Heap Scan on events e (cost=4517.63..242635.10 rows=26950 width=58) (actual time=409.882..2154.587 rows=142547 loops=3)
Recheck Cond: (room_id = '!OGEhHVWSdvArJzumhm:matrix.org'::text)
Rows Removed by Index Recheck: 7928064
Filter: (type = 'm.room.message'::text)
Rows Removed by Filter: 138493
Heap Blocks: exact=31299 lossy=164255
-> Bitmap Index Scan on events_room_stream (cost=0.00..4501.46 rows=120367 width=0) (actual time=317.490..317.491 rows=843191 loops=1)
Index Cond: (room_id = '!OGEhHVWSdvArJzumhm:matrix.org'::text)
One possible way we could move forward is to add another column to room_memberships
called participant
that starts off as false
, and is set to true
once a user sends a message in a room. You would need a background job to populate this table from the current database, but going forwards it could be updated trivially, and read from very quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a column sounds practical given I don't think there's another way around scanning the events table for evidence of participation - in terms of this PR, I am thinking that it would make the most sense to open a separate PR to add the column and the background job to populate it, and then once that is complete add logic to this PR to read from the column - would that be a sensible approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@H-Shay that sounds right to me! #17512 may serve as a good example of adding the background job. I don't believe you'll need to bump SCHEMA_COMPAT_VERSION
as we're not removing any columns/tables; only adding.
The background updates documentation is useful here, and remember that boolean columns need to be noted in the port script.
You may find that you want to block the Admin API from returning any results until the background update is done - lest you get incomplete results. Up to you though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is #18068
The following query parameters are available: | ||
|
||
* `participant` - Optional. If provided and set to true, only returns currently joined members who have | ||
also posted to the room. Defaults to false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also posted to the room. Defaults to false. | |
also posted to the room. Having "posted" is defined as sending an `m.room.message` | |
event at some point in the room's history. Defaults to false. |
@H-Shay why only users who posted to the room? why not even the users who joined a room and left it without even sending a message? you know to get the historical room data |
@lamoboos223 the existing endpoint is only for currently-joined members. If you want to get historical members from a room, I suggest pulling the full membership state of the room and filtering by users whose membership value is |
Adds a query param
participant
to admin endpointGET /_synapse/admin/v1/rooms/{room_id}/members
which returns a list of all currently joined room members who have also posted to the room.