MSC2108: Sync over Server Sent Events #2108

Open
wants to merge 2 commits into
base: old_master

@stalniy stalniy commented Jun 11, 2019

Fixes matrix-org/matrix-spec#475
full proposal

Introduction

Currently, Matrix clients use long polling to get the latest state from the server. This becomes a problem when there are many clients because:

  • the homeserver has to process a lot of requests, many of which may return nothing
  • it wastes network bandwidth: every poll is a new HTTP request with its own headers, cookies, and other overhead
  • for mobile clients, it consumes cellular data, which costs money
  • when the homeserver serves HTTP over TLS (which is recommended), clients that open a new connection for each poll repeat the most expensive operation, the TLS handshake, again and again

So, instead of long polling, I propose implementing the sync logic over Server Sent Events (SSE).

Proposal

Server Sent Events (SSE) is a way for servers to push events to clients. It was introduced as part of the HTML5 standard and is now available in all major web and mobile browsers.
It was specifically designed to overcome the challenges of short and long polling. Adopting it gives us the following benefits:

  • only one persistent connection per client, kept open "forever"
  • SSE is built on top of HTTP, so it can also be used for communication between servers
  • SSE works well with existing infrastructure (load balancers, firewalls, etc.)
  • web and mobile browsers support automatic reconnection and the Last-Event-ID header out of the box
  • the Matrix protocol is built on HTTP, so SSE should fit well into the protocol specification

@turt2live
Member

@stalniy thanks for making this! Just two things need to be done before this is ready for review:

  • Please line wrap around 90 characters for easier review
  • Please Sign Off on the changes so we are able to merge the proposal later.

@turt2live changed the title from "Sync over Server Sent Events" to "MSC2108: Sync over Server Sent Events" on Jun 11, 2019
@turt2live added the proposal label (A matrix spec change proposal) on Jun 11, 2019
@stalniy
Author

stalniy commented Jun 11, 2019

Cool! Will do in a few hours.

Signed-off-by: Sergii <sergiy.stotskiy@gmail.com>
@stalniy
Author

stalniy commented Jun 11, 2019

Added Signed-off-by: Sergii <sergiy.stotskiy@gmail.com> to the commit message.
Also wrapped lines at ~90 characters in the md file.

Contributor

@Half-Shot left a comment

Idea sounds quite neat but more clarifications are needed in the doc.

proposals/2108-sync-via-server-sent-events.md
* instead of using the `since` query parameter, the next batch token will be passed through the `Last-Event-ID` header.
* each event will have the same format as what `/sync` returns. The id of each event will be the `next_batch` token
* the server sends events in exactly the same way that it would send responses to `/sync` calls with the `since`
parameter set to the previous `next_batch`
Contributor

As per above, perhaps you could give a description of what an event is in SSE terms? I assume an SSE event is the body being sent along the wire, but it's not explicit here.

Author

Added links to documentation for the shape of matrix sync endpoint and for the format of SSE event.

Member

It still doesn't make any sense to me: how is a client meant to resume their sync stream? All they'd have is a Last-Event-ID and a filter.

Member

If their call to /sync/sse gets killed, then they just make a new call, with the Last-Event-ID header set to the last SSE event ID that it saw.

It may be helpful to give an example of how the requests/responses look, so that people don't have to go digging through the SSE docs.
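
To make the resume flow concrete, here is a minimal Python sketch of the client side. It assumes the server sets each SSE event's `id` to the `next_batch` token, as the proposal describes. `parse_sse` and `resume_headers` are hypothetical helper names, not part of any Matrix SDK, and the parser only handles the `id` and `data` fields plus comment lines.

```python
def parse_sse(stream_text):
    """Parse raw SSE text into a list of (event_id, data) tuples."""
    events = []
    event_id, data_lines = None, []
    for line in stream_text.split("\n"):
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                events.append((event_id, "\n".join(data_lines)))
            data_lines = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "id":
            event_id = value  # the last seen id persists across events
        elif field == "data":
            data_lines.append(value)
    return events


def resume_headers(events, access_token):
    """Build headers for reconnecting to /sync/sse (hypothetical helper)."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Accept": "text/event-stream",
    }
    if events and events[-1][0] is not None:
        # Assumption: the event id is the next_batch token to resume from.
        headers["Last-Event-ID"] = events[-1][0]
    return headers
```

If the connection drops after the client has seen an event with id `s3`, the reconnect request would carry `Last-Event-ID: s3`, which plays the role that `since=s3` plays for `/sync`.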

Member

+1 to examples. The SSE docs are a bit too complicated to try and parse in parallel here. Doesn't even need to be valid SSE:

GET /_matrix/client/r0/sync/sse
Last-Event-ID: 12

{"sync": "response_here"}
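
For comparison, a well-formed SSE frame for the same exchange might be produced as follows. This is only a sketch: `format_sse_event` is a hypothetical helper, and it assumes the event `id` is the response's `next_batch` token and that the `data` field carries the `/sync`-shaped JSON on a single line.

```python
import json


def format_sse_event(sync_response):
    """Serialise one /sync-shaped dict as a single SSE event frame."""
    # json.dumps escapes newlines inside strings, so the payload fits on
    # one "data:" line and needs no multi-line handling.
    payload = json.dumps(sync_response, separators=(",", ":"))
    # Assumption from the proposal: the event id is the next_batch token.
    return f"id: {sync_response['next_batch']}\ndata: {payload}\n\n"
```

With `next_batch` equal to `s13`, the frame starts with the line `id: s13`, followed by one `data:` line and the blank line that ends the event.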

@stalniy
Author

stalniy commented Jun 18, 2019

I looked through the changelog of the Client-Server API but cannot find any information on why the /events API was deprecated. Can somebody clarify, please?

In terms of SSE, it makes sense to publish new events to clients as soon as changes are applied by the homeserver, so I guess that most of the time almost all fields of the /sync response will be empty, except for the one that actually contains the payload for the single event that represents a change.

So, right now I have doubts about whether it makes sense to reuse the payload format of the /sync endpoint.

@stalniy
Author

stalniy commented Jun 25, 2019

Any news?

@turt2live self-requested a review on June 25, 2019 21:27
@uhoreg
Member

uhoreg commented Jun 25, 2019

I looked through the changelog of the Client-Server API but cannot find any information on why the /events API was deprecated. Can somebody clarify, please?

I'd guess that it's because /sync can do the same thing, but includes other things too.

In terms of SSE, it makes sense to publish new events to clients as soon as changes are applied by the homeserver, so I guess that most of the time almost all fields of the /sync response will be empty, except for the one that actually contains the payload for the single event that represents a change.

So, right now I have doubts about whether it makes sense to reuse the payload format of the /sync endpoint.

I don't think it should be too bad, because all of the top-level items (except for next_batch) in the response can be omitted, so if the payload is just a single event, then only the relevant sections will be included. (I don't think synapse omits empty sections in /sync, but that's synapse's fault.)
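
As a rough illustration of that point, the following Python sketch drops top-level sections that contain nothing but empty containers, always keeping `next_batch`. The helper names `is_empty` and `minimal_payload` are made up for this example, and the sample field values are placeholders rather than output from a real homeserver.

```python
def is_empty(value):
    """True if a value holds no data: an empty list, or a dict of empties."""
    if isinstance(value, dict):
        return all(is_empty(v) for v in value.values())
    if isinstance(value, list):
        return len(value) == 0
    # Scalars (strings, booleans, numbers) count as data, to stay conservative.
    return False


def minimal_payload(sync_response):
    """Keep next_batch plus only the top-level sections that carry data."""
    return {
        key: value
        for key, value in sync_response.items()
        if key == "next_batch" or not is_empty(value)
    }
```

For a response where only one room has a new timeline event, the result contains just `next_batch` and `rooms`, so the per-event overhead of reusing the `/sync` format stays small.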

@stalniy
Author

stalniy commented Jun 26, 2019

Is there anything else I can do to move this forward? Or may I start playing with an implementation?

but instead propose to use a different underlying technology to do this. So:
* lets expose `/sync/sse` URL for SSE in order to be backward compatible with other clients and servers
* this URL returns the same data as continually calling `/sync`
* it accepts the same parameters as `/sync`, except `since` and `timeout`
Member

how does set_presence work in this proposal?

Member

Good question. My guess is that it would be equivalent to continually calling /sync with the same set_presence parameter, for as long as the SSE connection is alive. When the SSE connection dies, then the behaviour would be the same as a client no longer calling /sync. Maybe this should be made more explicit.

but instead propose to use a different underlying technology to do this. So:
* lets expose `/sync/sse` URL for SSE in order to be backward compatible with other clients and servers
* this URL returns the same data as continually calling `/sync`
* it accepts the same parameters as `/sync`, except `since` and `timeout`
Member

how does full_state work in this proposal?

Member

Good question. I think that there are three options:

  1. drop full_state for SSE. If a client wants to do full_state, they can call /sync to start with, and then continue the sync using SSE.
  2. only return the full state for the first SSE event, and incremental state for subsequent SSE events
  3. return the full state for every SSE event. The first SSE event gets sent immediately (as it would when calling /sync with full_state), but subsequent SSE events only get sent when there's actually something new

Personally, I think my preference would be 2, then 1, then 3, with 1 and 2 being fairly close, and 3 being much less preferred.
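
Option 2 could be sketched like this in Python, under the assumption that the server produces each SSE event by running the equivalent `/sync` query internally. `sse_sync_params` is a hypothetical helper, not an existing homeserver function.

```python
def sse_sync_params(base_params, last_event_id, is_first_event):
    """Build the query for the /sync call that backs one SSE event."""
    params = dict(base_params)  # copy so the caller's dict is not mutated
    # Option 2: honour full_state only for the first event of a connection.
    if not is_first_event:
        params.pop("full_state", None)
    # The Last-Event-ID header takes the place of the since parameter.
    if last_event_id is not None:
        params["since"] = last_event_id
    return params
```

The first event of a connection is built with `full_state` intact, and every later event falls back to an ordinary incremental sync from the previous event's id.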

So, instead of long polling I propose to implement
sync logic over [Server Sent Events][mdn-sse](SSE)

## Proposal
Member

There's no mention of how limited syncs work in this proposal. Syncs can be limited due to a filter or due to the server's maximum willingness to serve events.

The existing sync endpoint is built for long polling, which doesn't really make it suitable for SSE. Although the sync format does lend itself to being a nice and backwards compatible data structure, I think it would be best if we used a more stream-oriented structure for SSE.

Member

My thought on this was that it would work just like calling /sync repeatedly. For example, if you call /sync/sse with the Last-Event-ID set to some value, and the equivalent /sync call would have returned a limited sync, then the first SSE event that you get will also be a limited result. Also, if the server is streaming SSE events, and you suddenly get a billion events in one room, then the next SSE event that it sends will be limited. It doesn't quite fit perfectly with a streaming model (for example, calling /sync/sse with the same Last-Event-ID two times won't give you the same result for the first event), but it fits with the way that Matrix currently handles truncating results. And I don't think SSE has a native method of skipping over events.
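
On the client side, handling a limited SSE event would then look the same as handling a limited `/sync` response. A minimal Python sketch, assuming the SSE `data` field decodes to a `/sync`-shaped dict (`limited_rooms` is a hypothetical helper name):

```python
def limited_rooms(sync_payload):
    """Return {room_id: prev_batch} for joined rooms with a truncated timeline."""
    joined = sync_payload.get("rooms", {}).get("join", {})
    return {
        room_id: room.get("timeline", {}).get("prev_batch")
        for room_id, room in joined.items()
        if room.get("timeline", {}).get("limited")
    }
```

A client could use each returned `prev_batch` token to backfill the gap through the room messages endpoint, exactly as it would after a limited long-polling sync.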

Member

Again, adding some examples to the proposal may be useful.

@turt2live
Member

turt2live commented Jun 26, 2019

Is there anything else I can do to move this forward? Or may I start playing with an implementation?

@stalniy: typically review takes a lot longer than we'd like. Implementations help prove that an MSC is needed and serves a purpose; however, at this early stage in the proposal process it doesn't make much sense to engineer a perfect implementation. Something which shows the idea as a proposal aid is certainly worthwhile, given the complexity of this proposal (i.e. a plain HTML+JS page which dumps JSON bodies into the DOM as a demo). A proper implementation demonstration is required later in the proposal stages.

For the time being, it's best to bring it up every couple months in #matrix-spec:matrix.org to try and solicit review.

@ananace
Contributor

ananace commented Oct 18, 2019

Just to add a little note here as well, I decided to write my own little personal testbed for a Server-Sent Events backend as a separate little middleware server in Ruby.
https://github.com/ananace/ruby-sinatra-matrix_sse/

It should fit in beside a regular Synapse instance, and just requires a reverse proxy to route /_matrix/client/r0/sync/sse to it, at which point it will take over and run a /sync-loop for each request.
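
The core of such a bridge can be sketched in a few lines of Python. This is not the code from the linked Ruby project, only an illustration of the idea: `sync` is a hypothetical callable that performs one long-polling `/sync` request upstream and returns the decoded response, and `max_events` exists only so the loop can be bounded in a demo or test.

```python
import json


def sse_from_sync_loop(sync, since=None, max_events=None):
    """Yield SSE frames by repeatedly calling sync(since).

    `sync` is a hypothetical callable returning a /sync-shaped dict.
    """
    sent = 0
    while max_events is None or sent < max_events:
        response = sync(since)
        # Feed each response's next_batch into the following request.
        since = response["next_batch"]
        yield f"id: {since}\ndata: {json.dumps(response)}\n\n"
        sent += 1
```

The initial `since` value would come from the client's `Last-Event-ID` header, so a reconnecting client picks up where its previous stream stopped.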

@IngwiePhoenix

Any updates? This would be a very important feature to have, especially for mobile users or areas with very spotty internet connection where a multitude of requests actually hurts performance quite a lot.

@stalniy
Author

stalniy commented Mar 20, 2020

This proposal needs to be updated in order to be reviewed again. I’m currently working on my open source project and don’t have time to fix this.

@IngwiePhoenix you can join in and help address all the suggestions, then merge the changes back into my repo so we can continue the discussion in the same PR.

@ShadowJonathan
Contributor

@stalniy @IngwiePhoenix is there a possibility this MSC will be picked back up? If not, do any of you two know how we can push this MSC forward? I think there'd be a lot of interest for this.

@kegsay
Member

kegsay commented Apr 3, 2021

An extension to MSC 3079 to support WebSockets may be an alternative to this proposal. It requires more client and server work, but saves far more bandwidth than this proposal can hope to. Servers can use a proxy initially (so it doesn't help server resources too much) and clients can use HTTP shims/interceptors.

@stalniy
Author

stalniy commented Apr 3, 2021

I’ll try to find some time in the next few weeks to update this. However, I’m not a Python expert so would be cool if somebody could implement it eventually

@ShadowJonathan
Contributor

However, I’m not a Python expert so would be cool if somebody could implement it eventually.

I volunteer, I've been working with the synapse codebase for a little while, MXID is @jboi:jboi.nl if you wanna contact me.

@stalniy
Author

stalniy commented Apr 3, 2021

Awesome! @ShadowJonathan I will ping you when this proposal settles down.

@turt2live added the needs-implementation label (This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.) on Jun 8, 2021
@Polve

Polve commented Aug 12, 2022

Is there any news on this topic?
This seems like a very interesting solution, even if personally I would prefer something based on WebSockets + STOMP.

@anoadragon453
Member

@Polve This will probably be made redundant by MSC3575.

@ananace
Contributor

ananace commented Aug 15, 2022

@anoadragon453 I'd personally like to see this MSC (or a followup MSC) transition to "Sliding Sync over Server Sent Events", since there are quite a few places where there is native support for SSE for event streams, which offer both performance and power usage improvements over repeating application-triggered requests. Android being one such example.

@mcg-matrix

[...] since there are quite a few places where there is native support for SSE for event streams, which offer both performance and power usage improvements over repeating application-triggered requests. Android being one such example.

I have no clue whether that's true; but in my opinion, reducing battery usage of Element-Android (while providing instant notifications, without involving Google or other 3rd parties and points of failure) should be considered most important.

@anoadragon453
Member

@ananace Indeed. It's worth noting that the Sliding Sync MSC explains why it proposes HTTP long-polling over WebSockets (or SSE) which is simply that it's more of an incremental change for today's implementations to transition from Sync v2.

However the MSC also states that the design of the proposal makes it easy to transition to a WebSockets implementation in the future after Sliding Sync is implemented, which is true. I think it'd be slightly more of a leap to SSE, as SSE doesn't define methods for clients to communicate back to the server. In Sliding Sync, the client is constantly asking for specific data (the messages in the room I'm currently looking at, the rooms I can currently see on my screen, etc.) and this information is constantly updating.

So I think you'd end up with SSE + an endpoint the client would keep hitting to get the server to send down different events. And that doesn't seem all that different from the bi-directional pipe that WebSockets provides though.

The document does mention some other advantages to SSE; I'm not sure how much weight those have.

@anoadragon453
Member

@mcg-matrix None of this will help you if your Element Android client is in the background and your phone has stopped the process. You still need push notifications to get instant notifications.

Check out https://unifiedpush.org/ if you'd like a push solution that doesn't involve Google and is supported by Element Android and other applications.

@ananace
Contributor

ananace commented Aug 15, 2022

@anoadragon453 Yeah, an SSE version of sliding sync would be a very different beast from an SSE version of current sync.

I had some design ideas on the subject when in other discussions of MSC3575, and I still think it'd be useful as a transport - just maybe not for the most general use-case.

@mcg-matrix

None of this will help you if your Element Android client is in the background and your phone has stopped the process.

Sounds obvious. :-) I was not trying to find a way out of such a situation.

Check out https://unifiedpush.org/ if you'd like a push solution that doesn't involve Google and is supported by Element Android and other applications.

Thanks, I had heard of UnifiedPush; I wouldn't like any 3rd party or point of failure in addition to Element + Homeserver for my users.

Labels
  • kind:feature: MSC for not-core and not-maintenance stuff
  • needs-implementation: This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.
  • proposal: A matrix spec change proposal
Development

Successfully merging this pull request may close these issues.

Replace long polling to receive events with Server Sent Events (SSE)