MSC3051: A scalable relation format #3051

371 changes: 371 additions & 0 deletions proposals/3051-scalable-relations.md
@@ -0,0 +1,371 @@
# MSC3051: Scalable relations
Member

Note that the spec does not use the term "relation" anywhere in the text.

Suggested change
# MSC3051: Scalable relations
# MSC3051: Scalable event relationships


Edits, reactions, replies, threads, message annotations and other MSCs have
shown, that relations between events are very powerful and useful. Currently the
format from [MSC2674](https://github.com/matrix-org/matrix-doc/pull/2674) is
used. That format however limits each event to exactly one relation. As a result
Comment on lines +4 to +6
Member

MSC2674 is now canon. Also, some minor grammar/wording improvements:

Suggested change
shown, that relations between events are very powerful and useful. Currently the
format from [MSC2674](https://github.com/matrix-org/matrix-doc/pull/2674) is
used. That format however limits each event to exactly one relation. As a result
shown that relationships between events are very powerful and useful.
However, the [current format](https://spec.matrix.org/v1.7/client-server-api/#definition-mrelates_to)
limits each event to at most one relationship. As a result

events rely on other ways to represent secondary relations. For example, edits
keep the relation from the previous event. Support for changing or deleting
that relation is limited. In theory you could pass that in `m.new_content`, but
clients don't seem to support that, and the actual deletion of a relation is
unexplored as well.

There are many cases where 2 or more relations on an event would be useful. This
Member

@ara4n ara4n Jun 6, 2023

I agree that we seem to be coming up against more and more scenarios where having multiple relations on a single event could be useful. I've tried to summarise my original rationale at #4023 (comment) for sticking to a simple {Subject, Verb, Object} triple on relations, and concluding that the limits might outweigh the benefits - especially given the existence of extensible events, where we can decorate a given event with additional structured metadata; so why not also be able to decorate a given event with additional relations too. but tl;dr: i'd be supportive of changing to lists of relations rather than hacking around them with stuff like the is_falling_back field in MSC3440.

MSC proposes a simple way to do that and replace the currently proposed format.
Comment on lines +13 to +14
Member

I would find it useful to mention some of these use-cases. The only one I see below is "a description for multiple files", which I don't think even has a relation proposed.

Are there other use-cases you can think of that would be useful? The only one I know of is threads.


the other use case mentioned is replacing the original message's replied to message with an edit

Contributor Author

Replacing replies in an edit (or removing it), editing inside of threads, replying inside of threads.

Considering what relations we currently have:

  • replies
  • edits
  • threads
  • annotations
  • (references)

I can see it being useful for edits and threads, while for annotations and replies it might only be useful in combination with other relations. No idea about references, since those are currently not very well defined. I don't think it is unlikely that in the future there will be more relation types that can benefit from it. (E.g. I could imagine wanting to reply to multiple messages, to show someone when something was mentioned before and other cool stuff.)


## Proposal
@chayleaf chayleaf Aug 30, 2021

You should potentially touch upon encryption as well? See https://github.com/matrix-org/matrix-doc/issues/2678 for ongoing discussion.

In short, aggregations are useful - so the server needs to be able to return all events relating to a specific message - but the server doesn't have to know any more than that. Potentially, even filtering by event type isn't needed (and if it becomes necessary, it can always be added later, adding unencrypted metadata is easier than removing it). In the unencrypted version of the message content, you could hash the event_id field using a message-specific salt, and rel_type could either be omitted or hashed as well; other data does not have to be included.

Contributor Author

What @Sorunome mentioned on #2678 is an idea we worked on together. Until that is properly worked out, we would just not encrypt the relations. In theory you don't need to know the actual values to aggregate relations with APIs. You can just tell the API what values it should aggregate for you. It is just less efficient and you run into trouble, if you automatically want to include the aggregations in the unsigned section. I think encryption for relations can be solved in an independent MSC, since it is quite a difficult topic. In theory privacy sensitive clients could also just not put unencrypted relations into the event at all, although currently that would be disallowed.


i see, that makes sense

Contributor

Creating an unencrypted protocol and trying to layer encryption on top later is not a good way to make a secure protocol. We should avoid adding leaky features until the encryption has been sorted.

Contributor Author

@kevincox, you can complain about that on the original relations MSC. I made this MSC to fix some issues with the original MSC, but I didn't want any big changes that would make them hard to compare or lead to additional bikeshedding. I think encrypted relations can just be a separate relation type without the need of having to define the exact format upfront, the same way that Matrix defined messages first and later added an encrypted type.


To support multiple relations per event, this MSC proposes the following format:

```json
{
  "content": {
    "m.relations": [
      {
        "event_id": "$some-other-event",
        "rel_type": "m.in_reply_to"
      },
      {
        "event_id": "$some-third-event",
        "rel_type": "m.replaces"
      },
      {
        "event_id": "$event-four",
        "rel_type": "org.example.custom_relation",
        "key": "some_aggregation_key"
      }
    ]
  },
  "event_id": "$something",
  "type": "m.room.message"
}
```
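
A client might read this format along the following lines. This is only a minimal sketch: the `Relation` class and the `parse_relations` function are hypothetical names, and silently skipping malformed entries is an assumption, not something this MSC specifies.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Relation:
    rel_type: str
    event_id: str
    key: Optional[str] = None  # only present on aggregating relations


def parse_relations(content: dict[str, Any]) -> list[Relation]:
    """Return the well-formed entries of `m.relations`, in list order."""
    raw = content.get("m.relations", [])
    if not isinstance(raw, list):
        return []
    relations = []
    for entry in raw:
        if not isinstance(entry, dict):
            continue  # assumption: ignore malformed entries instead of failing
        rel_type, event_id = entry.get("rel_type"), entry.get("event_id")
        if isinstance(rel_type, str) and isinstance(event_id, str):
            key = entry.get("key")
            relations.append(
                Relation(rel_type, event_id, key if isinstance(key, str) else None)
            )
    return relations
```

Calling `parse_relations(event["content"])` on the event above would yield three `Relation` values, the last one carrying `key="some_aggregation_key"`.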

This has a few benefits:

- You can relate to multiple events at the same time. (E.g. you have a
  description for multiple files you sent.)
- You can have multiple different relation types at once. (E.g. an edit that
  is also a reply, or a reaction inside a thread.)
- You don't need to look up reply relations in multiple events for edits. The
  edited event is canonical and can be used standalone, without having to look
  up the original event to figure out what was replied to. You can also remove
  a relation with an edit now. (Useful if you replied to the wrong message or
  didn't mean to reply to anyone.)
Comment on lines +50 to +54
Member

I'm not really sure I follow what this is suggesting. Does this propose changes to MSC2676? I don't see how this really helps, maybe this section could use an example of an event which gets edited twice?

Contributor Author

I've added a few examples of how this would affect the other relations in 2 Appendices. Those are just ideas but not actual changes to those MSCs, since that is probably better done on those MSCs.

- This format is conceptually a lot simpler if an event has multiple relations.
  You don't run into issues with packing relations into `m.new_content`,
  especially for encrypted events, etc. You just have a list of relations.

If clients want to stay backwards compatible (for a while at least), in many
instances it is possible to generate an `m.relates_to` object from the relations
list. This can be done by picking a primary relation, e.g. the edit relation,
and then packaging up the remaining relations in `m.new_content` or simply
throwing them away. Since this proposal uses `m.relations`, this does not
conflict with the current relations from the other MSCs. One can also generate
the relations list of this MSC from the old relations, since the new
relations are a strict superset, which may be useful to make handling inside of
a client easier.
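
Generating the `m.relates_to` fallback from the list could look roughly like this. The sketch assumes that an edit is always chosen as the primary relation and that the remaining relations are thrown away; `legacy_relates_to` and the `EDIT` constant are illustrative names, and `m.replace` is assumed to be the edit relation type.

```python
from typing import Any, Optional

EDIT = "m.replace"       # assumed rel_type for edits
REPLY = "m.in_reply_to"  # rel_type this MSC uses for replies


def legacy_relates_to(relations: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Pick one primary relation for `m.relates_to`; the others are dropped."""
    if not relations:
        return None
    # An edit takes priority; otherwise use the first relation in the list.
    primary = next((r for r in relations if r.get("rel_type") == EDIT), relations[0])
    if primary.get("rel_type") == REPLY:
        # Replies do not use `rel_type` in the existing format.
        return {"m.in_reply_to": {"event_id": primary["event_id"]}}
    return dict(primary)
```

A sender would put the result next to `m.relations` in the event content, so that clients that only understand `m.relates_to` still see the primary relation.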
Comment on lines +59 to +67
Member

It is unclear how this would be implemented. Is there a prioritized list of what relations to use? What happens if m.relates_to conflicts with m.relations?

Contributor Author

If there is m.relations, you use that. It is a superset of m.relates_to, and if a client sends both, the m.relates_to is probably a fallback.

A concrete example of how one can implement the fallback parsing logic is here: https://github.com/Nheko-Reborn/mtxclient/pull/48/files#diff-6c2fae13f9cbfbde2c2f9e0f681b252e3d6f33df71d3f495637ce6e17b1286a9R211-R263

Basically for parsing you can always convert relations to the new format by just parsing any relation you can and stuffing them in the list. One issue is that replies might get lost; for that we use a flag to indicate that this was generated, and in that case use the normal lookup rules for what an edit is a reply to.

Emitting the fallback is a bit more tricky. What my implementation does is order relations by priority. If something is an edit, we send an edit in m.relates_to. Otherwise we just send the first relation we find, because the other relations usually don't get combined so far.
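
The parsing direction described in this comment can be sketched as follows. This is not the linked mtxclient code, only an illustration of the same idea: `effective_relations` is a hypothetical name, and the returned boolean plays the role of the "this was generated" flag.

```python
from typing import Any


def effective_relations(content: dict[str, Any]) -> tuple[list[dict[str, Any]], bool]:
    """Return (relations, synthesized).

    `synthesized` is True when the list was built from the old `m.relates_to`
    object, in which case a reply relation may be missing.
    """
    if isinstance(content.get("m.relations"), list):
        return list(content["m.relations"]), False  # the new format wins
    legacy = content.get("m.relates_to")
    if not isinstance(legacy, dict):
        return [], False
    relations = []
    if "rel_type" in legacy and "event_id" in legacy:
        relations.append(
            {k: legacy[k] for k in ("rel_type", "event_id", "key") if k in legacy}
        )
    reply = legacy.get("m.in_reply_to")
    if isinstance(reply, dict) and "event_id" in reply:
        relations.append({"rel_type": "m.in_reply_to", "event_id": reply["event_id"]})
    return relations, True
```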

Member

You probably want to do some assertion that what's in m.relates_to is the same as what's in m.relations, otherwise this could be used to show different clients different content.

Contributor Author

That does sound reasonable, but I don't think it is that important and we don't have the same for the other fallbacks either. I.e. the edit fallback, the reply fallback. If you implement sanity checks for those, you probably want them here too. Tbh, I would prefer to keep the period where one needs to emit a fallback to a minimum, because I don't like that clients can see different things, but #2781 doesn't seem to be a priority for anyone either, so the consensus seems to be, that this is an acceptable risk.


## Potential issues


I think fallback needs to be touched upon as well - mostly just what should clients send as fallback info


### Ordering

The list of relations is not hierarchical. As such, there is no order in which
a top-level relation, like an edit, has priority over a lower-level relation,
like a reply.

I don't believe that is an issue in practice. If you edit a message with a
reply, there is a natural meaning to the combination of both relations. You can
even apply them in any order, imo. But there may be other relations where this
causes more issues. An MSC introducing such a relation should then specify how
to handle conflicts.
Comment on lines +77 to +81
Member

I think this is saying if you have a message that contains an edit relation and a reply relation would mean:

Find the event that the edit refers to and replace it with this event, which is now also a reply.

That seems fairly hierarchical to me, and I don't see how you can apply those in the opposite order?

Contributor Author

It depends on how you implement your client. You can render the event as a reply first, and then place it at the location of the event, that was edited. Or you replace the event data in the database for that location first, then tell the UI to rerender that event, and it will naturally pick up that this event now is a reply to X.

At least in my clients, rendering events is usually a sequence of "is this a reply?", "is this an edit?", "is this in a thread?", but those things can be applied pretty much independently. There isn't really a need to order it protocol wise, because my clients just pick from the list what they need. If you have [edit, reply] or [reply, edit], that should be easy to handle.

Alternatively, you could make it hierarchical, specify what each relation can contain as other relations. But I can't see much benefit there, it is just making a more complicated list/graph. I.e. if you have:

{
  "rel_type": "m.thread",
  "event_id": "$something",
  "m.in_reply_to": { "event_id": "$abc" }
}

What is the benefit over:

[
  {
    "rel_type": "m.thread",
    "event_id": "$something"
  },
  {
    "rel_type": "m.in_reply_to",
    "event_id": "$abc"
  }
]

In my case I found the first one to be harder to work with, because I needed to add a lot of special cases to the parser, while the second one didn't make the UI any harder to implement, while the SDK is much simpler. The first one also doesn't tell me how to extend it to support edits, that would be another special case, while in the second one it is natural. And the first one actually needs you to define an order, while a client might have an easier time, if the order was different.

I guess what I am trying to say is that I don't see an explicit order as that helpful. It is very much like a() && b() && c(): while that statement does have an order, if a, b and c don't have side effects, the result is the same even if you reorder it.
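
To illustrate the point about independent questions, here is a small sketch. The `describe` helper is hypothetical and the relation type strings are assumptions; it only shows that a client picking what it needs from the list gets the same answer for [edit, reply] and [reply, edit].

```python
from typing import Any, Optional


def describe(relations: list[dict[str, Any]]) -> dict[str, Optional[str]]:
    """Answer each rendering question independently of list order."""

    def target(rel_type: str) -> Optional[str]:
        # First relation of the requested type, or None if there is none.
        return next(
            (r.get("event_id") for r in relations if r.get("rel_type") == rel_type),
            None,
        )

    return {
        "edits": target("m.replace"),
        "reply_to": target("m.in_reply_to"),
        "thread": target("m.thread"),
    }


edit = {"rel_type": "m.replace", "event_id": "$old"}
reply = {"rel_type": "m.in_reply_to", "event_id": "$abc"}
same = describe([edit, reply]) == describe([reply, edit])  # True
```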


### Conflicting relations

Some relation types should probably not be combined. For example, you may
disallow editing a reaction, because clients probably won't handle that
correctly. This MSC, however, does not disallow that. Specifications that define
relations should specify how clients should handle such combinations, and
clients sending them should be aware that they probably won't get handled. I
don't think allowing just 1 relation is the solution to handling such
conflicts, and I don't think they will happen much in practice.
Comment on lines +85 to +91
Member

I think it is up to this MSC to define how this would work for the existing relations (which are MSCs, but are widely implemented so are in a weird place, standards-wise).

I'm a bit nervous this will put us down a path where we need to have "relation rules" to define what a valid set of relations on an event is. This might be worth it, but would need to be thought through and could add a lot of complexity to servers (as it is another set of "auth rules").

Contributor Author

I've added examples for this, why I think we don't need very strict rules for this and how implicit rules could look like. Maybe you can give me an opinion on that, if that is enough to resolve your concern or if the MSC actually needs to spell out explicit rules for conflict resolution.


There are some examples of conflict resolution in Appendix B.

## Alternatives

- We could just stick with the existing proposal to only have 1 relation per
event. This is obviously limiting, but works well enough for a lot of
relation types.
- There are a few other ways to structure relations like using an object instead
of an array, etc. I believe this is the most usable one.

## Security considerations

Multiple relations may increase load on the server and the client and provide
more opportunities to introduce bad data. Servers and clients should take
additional care and validate accordingly. It should not be considerably worse
than single relations though, and servers may limit relations to a reasonable
amount (like they already do for devices).
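
A basic shape and size check could look like the sketch below. The limit of 16 is a made-up number for illustration, since this MSC does not specify one, and `validate_relations` is a hypothetical helper.

```python
from typing import Any

MAX_RELATIONS = 16  # hypothetical limit; the proposal does not name a number


def validate_relations(raw: Any, max_relations: int = MAX_RELATIONS) -> list[dict[str, Any]]:
    """Raise ValueError unless `raw` is a reasonably sized list of relations."""
    if not isinstance(raw, list):
        raise ValueError("m.relations must be a list")
    if len(raw) > max_relations:
        raise ValueError("too many relations on one event")
    for entry in raw:
        if not (
            isinstance(entry, dict)
            and isinstance(entry.get("rel_type"), str)
            and isinstance(entry.get("event_id"), str)
        ):
            raise ValueError("each relation needs a string rel_type and event_id")
    return raw
```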
Comment on lines +105 to +109
Member

I'm quite nervous at the potential for abuse here, it seems like it would be quite easy to put odd groups of relations together, maybe this is already possible with the current system and not made much worse though.

Contributor Author

I think this is the biggest problem with this MSC. But I think in practice the amount of shenanigans you can do is somewhat limited. One issue I found, is that one can basically make a reply point to "itself" by having the edit relation and the reply relation point to the same event. So some clientside validation is definitely needed (same for the server side pagination APIs), but most of that is fixed by just doing basic sanity checks (maximum recursion depths, not rendering a reply relation on reactions, etc), I think most of those validations are fairly natural and you will have a harder time with the other fields in events having bad data (i.e. all the crypto events trying to cause overflows when parsing or similar). I think even if you cause an issue by making weird combinations, the result should in most cases be pretty harmless.
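
The self-reply case mentioned here is cheap to detect on the client. A minimal sketch, assuming `m.replace` and `m.in_reply_to` as the relation types (`has_self_reply` is a hypothetical name):

```python
from typing import Any


def has_self_reply(relations: list[dict[str, Any]]) -> bool:
    """True if the event edits and replies to the same target event."""
    edited = {r.get("event_id") for r in relations if r.get("rel_type") == "m.replace"}
    replied = {r.get("event_id") for r in relations if r.get("rel_type") == "m.in_reply_to"}
    return bool(edited & replied)
```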


## Unstable prefix

Clients should use `im.nheko.relations.v1.relations` instead of `m.relations`
and `im.nheko.relations.v1.in_reply_to` as the relation type for replies in the
meantime.
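
A client that wants to accept both spellings during the transition could normalise them on read. In this sketch, preferring the stable key when both are present is an assumption, and `read_relations` is a hypothetical helper.

```python
from typing import Any

STABLE_KEY = "m.relations"
UNSTABLE_KEY = "im.nheko.relations.v1.relations"
STABLE_REPLY = "m.in_reply_to"
UNSTABLE_REPLY = "im.nheko.relations.v1.in_reply_to"


def read_relations(content: dict[str, Any]) -> list[dict[str, Any]]:
    """Read relations from either key and normalise the reply rel_type."""
    raw = content.get(STABLE_KEY, content.get(UNSTABLE_KEY, []))
    if not isinstance(raw, list):
        return []
    relations = []
    for entry in raw:
        if not isinstance(entry, dict):
            continue
        entry = dict(entry)  # copy so the event content is left untouched
        if entry.get("rel_type") == UNSTABLE_REPLY:
            entry["rel_type"] = STABLE_REPLY
        relations.append(entry)
    return relations
```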

## Appendix

### Appendix A: Extended motivation

There are a few use cases where I find a single relation limiting. A few of
those are listed below.

#### Replies + Edit

One common mistake when sending a message is replying to the wrong message.
Currently, in most clients the only way to fix that is to send a new message and
delete the old one. This is what we had to do with all messages before edits
existed, but edits only support changing the content, not the relations, of a
message.

One obvious way to edit a reply with the single relation format is sending an
event like this:

```json
{
  "type": "m.room.message",
  "content": {
    "body": "I <3 shelties (now replying to the right parent)",
    "msgtype": "m.text",
    "m.new_content": {
      "body": "I <3 shelties",
      "msgtype": "m.text",
      "m.relationship": {
        "rel_type": "m.reply",
        "event_id": "$the_right_event_id"
      }
    },
    "m.relationship": {
      "rel_type": "m.replace",
      "event_id": "$some_event_id"
    }
  }
}
```

In practice, almost no client supports this. One reason could be that this is
not very obvious. Another could be that clients remove the fallback and merge
the `m.new_content` into the top level, but explicitly try to preserve the edit
relation, so the reply relation gets lost.

This format also has some theoretical drawbacks. It is very irregular: for a
server to understand this format, it needs to know about edits. Otherwise it
can't list all events with a reply relation to a specific event. This makes
single relations not very generic or extensible, which makes client-side
experiments much harder without server support.

It is also beneficial to always send the current reply relation in an edit
event. That way the edit can be rendered more or less standalone, without
needing to look up the reply relation in the edited event.

If we support edits in the protocol, there is little reason to only be able to
edit specific user visible parts of an event instead of all of them. It is a
wart.

With multiple relations the behaviour is obvious. The following event is a reply
and an edit. If no reply relation is given in an edit, the reply relation is
removed (if there was any):

```json
{
  "type": "m.room.message",
  "content": {
    "body": "I <3 shelties (now replying to the right parent)",
    "msgtype": "m.text",
    "m.new_content": {
      "body": "I <3 shelties",
      "msgtype": "m.text"
    },
    "m.relationships": {
      "m.replace": {
        "event_id": "$some_event_id"
      },
      "m.reply": {
        "event_id": "$the_right_event_id"
      }
    }
  }
}
```

#### Galleries ([MSC2881](https://github.com/matrix-org/matrix-doc/pull/2881))

Context:
https://github.com/matrix-org/matrix-doc/pull/2881#issuecomment-905905261

MSC2881 proposes to be able to send an event like this:

```json
{
  "type": "m.room.message",
  "content": {
    "msgtype": "m.text",
    "body": "Here is my photos and videos from yesterday event",
    "m.relates_to": [
      {
        "rel_type": "m.attachment",
        "event_id": "$id_of_previosly_send_media_event_1"
      },
      {
        "rel_type": "m.attachment",
        "event_id": "$id_of_previosly_send_media_event_2"
      }
    ]
  }
}
```

This is a description that groups 2 media events together and gives them a
common description (similar to how some other chat apps automatically group a
large batch of pictures). You should be able to reply with that and edit the
description. Because the media are sent as single events first, this
automatically works on clients not implementing this and gives you a rough
progress report, but still allows the timeline to stay clean if someone opens
the room later. This simply is not possible in this form without multiple
relations.

#### Threads ([MSC3440](https://github.com/matrix-org/matrix-doc/pull/3440))

Threads are a much requested feature. MSC3440 proposes a thread relation in the
following format:

```json
"m.relates_to": {
  "rel_type": "m.thread",
  "event_id": "$thread_root"
}
```

This is a very simple relation, but pretty powerful. However, this again
interacts with all other relation features, that currently make Matrix great.

You can somewhat reply in a thread, because replies still use a different
format:

```json
"m.relates_to": {
  "rel_type": "m.thread",
  "event_id": "$thread_root",
  "m.in_reply_to": {
    "event_id": "$event_target"
  }
}
```

This however prevents us from ever making replies a normal relation, if we only
allow a single relation.

Alternatively, reactions and edits do work in threads, but their behaviour is
not obvious. If a reaction or edit relates to an event in a thread, it is then
shown in the thread. This however means, that a server can't just allow clients
to filter by thread without explicitly supporting threads. It needs to always
query if the original event is in a thread instead of just returning all events
with a specific `rel_type` and `event_id`.
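
If relations were a flat list, as this MSC proposes, such a lookup could be a plain index on `(rel_type, event_id)` that needs no knowledge of threads. The following in-memory sketch is purely illustrative: `RelationIndex` is a hypothetical class and not how any real homeserver stores events.

```python
from collections import defaultdict
from typing import Any


class RelationIndex:
    """Maps (rel_type, target event_id) to the events that relate to it."""

    def __init__(self) -> None:
        self._by_target: dict[tuple[str, str], list[str]] = defaultdict(list)

    def add(self, event: dict[str, Any]) -> None:
        for rel in event.get("content", {}).get("m.relations", []):
            if not isinstance(rel, dict):
                continue
            rel_type, target = rel.get("rel_type"), rel.get("event_id")
            if isinstance(rel_type, str) and isinstance(target, str):
                self._by_target[(rel_type, target)].append(event["event_id"])

    def related(self, rel_type: str, event_id: str) -> list[str]:
        return list(self._by_target.get((rel_type, event_id), []))


index = RelationIndex()
index.add({
    "event_id": "$reaction",
    "content": {"m.relations": [
        {"rel_type": "m.annotation", "event_id": "$msg", "key": "+1"},
        {"rel_type": "m.thread", "event_id": "$thread_root"},
    ]},
})
in_thread = index.related("m.thread", "$thread_root")  # ["$reaction"]
```

A reaction that carries both an annotation and a thread relation is then found by the thread filter without the server having to inspect the event it reacts to.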

There is also no way to edit an event into a thread. Assuming you replied into
the wrong thread or none at all, there is no way to retroactively fix that,
because you can't easily add a thread relation by editing an event. The first
example in this Appendix describes the obvious way to do this with multiple
relations. In theory it would also be a very powerful tool if moderators could
"move" messages into threads by editing them. (Currently only the sender can
edit an event, but there are use cases where you might also want to allow mods
to do this.)

I would argue threads would be a much richer experience, if we allowed users to
combine them with any kind of relation! You could even weave threads together
and make a conversation "fabric"!

#### Replies to multiple events

Oftentimes people ask similar questions in a conversation. One way to focus the
conversation would be threads. Alternatively, it could be very useful to just
reply to multiple people, so that everyone knows that they are addressed. The
current solution is to just mention everyone by username, but sometimes that is
confusing, especially if one of the questions was asked further back in the
timeline.

### Appendix B: How would this work for existing relation types

This section gives some examples of how multiple relations could interact on
different events. These are not actually part of the proposal, but just
suggestions to help understand the format better.

#### Replies

If your client can't handle it, just pick the first reply from the relations
list. In the future this might be extended to reply to multiple messages at the
same time.

#### Edits

Having one edit apply to multiple events should probably be illegal. In this
case the first edit of the event is picked and the others are ignored.
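
Both this rule and the one for replies above amount to taking the first relation of a given type from the list. A minimal sketch, with `first_of_type` as a hypothetical helper and `m.replace` assumed as the edit relation type:

```python
from typing import Any, Optional


def first_of_type(relations: list[dict[str, Any]], rel_type: str) -> Optional[dict[str, Any]]:
    """Return the first relation with the given rel_type, ignoring later ones."""
    return next((r for r in relations if r.get("rel_type") == rel_type), None)


relations = [
    {"rel_type": "m.replace", "event_id": "$first"},
    {"rel_type": "m.replace", "event_id": "$second"},  # ignored
]
edit = first_of_type(relations, "m.replace")
```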

#### Edits + Replies

Having an edit and a reply relation is well formed. In this case the new reply
relation replaces the reply relation of the original event.
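
One way to read this rule: the relations of the edited message are simply those carried by the latest edit, minus the edit relation itself. The sketch below assumes `m.replace` as the edit relation type, and `relations_after_edit` is a hypothetical name.

```python
from typing import Any

EDIT = "m.replace"  # assumed rel_type for edits


def relations_after_edit(edit_relations: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Relations the edited message has once the edit has been applied."""
    return [dict(r) for r in edit_relations if r.get("rel_type") != EDIT]
```

An edit that carries no reply relation therefore yields a message without one, which matches the removal case described in Appendix A.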

#### Edits + Reactions

Having an edit and a reaction relation is illegal. You are not allowed to edit
reactions currently and this MSC would not change that.

#### Threads + Edits/Replies

Having a thread and an edit relation obviously makes sense. This is an edit in a
thread. The same applies to replies in threads. Clients may choose not to render
those replies to provide a simpler (Slack-style) view for threads, but that has
often been voiced as negative feedback on threads. There are a few vocal
users that want replies in threads.

#### Threads

Having multiple thread relations could be interesting. It would allow you to
"join" or "cross" threads. Whether clients want to actually render that or not I
have no opinion on, but the idea sounds interesting.

#### Threads + Reactions

This would make it easier to filter a room by a thread relation, but still have
reactions visible on the `/messages` pagination.

#### Annotation + Replies

This probably makes no sense. If it is a reaction event, you probably want to
render it as such, otherwise render it as a reply. Alternatively, just pick the
first relation.

This provides a way for a malicious client to make events render differently on
clients. But it just adds one more way to send invalid relation data. The client
could also just send invalid event ids, combine `m.room.message` with an
annotation relation and similar nonsense variants. While this adds one more way
to do that, I don't think it matters all too much.

#### Attachment (MSC2881)

Attachments are one of the motivating use cases for this proposal. They allow a
client to pull multiple media events together into a gallery with a description.
Obviously you want to be able to edit that description or reply with such a
"Gallery". You might want to do that in a thread. But sending one as a reaction
probably makes very little sense.

#### Conclusion

Most combinations are very simple and somewhat orthogonal. Clients can decide
which combinations they want to support. In some cases they might want to
validate a minimal sensible set of supported combinations on parsing, but even
if they don't, UI restrictions will in most cases lead to a sensible solution.
There are a few edge cases that can be abused, but their impact is only
minimally bigger than that of just combining an invalid event type and
`rel_type` or sending otherwise invalid data.
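
Validating a minimal set of combinations on parsing could be as small as the sketch below. The single forbidden pair reflects the "Edits + Reactions" case above; the names and the relation type strings are assumptions, and a real client would choose its own set.

```python
from typing import Any

# Pairs of relation types a client may refuse to combine (illustrative only).
FORBIDDEN_PAIRS = [frozenset({"m.replace", "m.annotation"})]


def forbidden_combinations(relations: list[dict[str, Any]]) -> list[frozenset]:
    """Return every forbidden pair that is fully present on the event."""
    present = {r.get("rel_type") for r in relations}
    return [pair for pair in FORBIDDEN_PAIRS if pair <= present]
```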