matrix-doc/specification/rooms/v2.rst

.. Copyright 2018-2019 New Vector Ltd
..
.. Licensed under the Apache License, Version 2.0 (the "License");
.. you may not use this file except in compliance with the License.
.. You may obtain a copy of the License at
..
..     http://www.apache.org/licenses/LICENSE-2.0
..
.. Unless required by applicable law or agreed to in writing, software
.. distributed under the License is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
.. See the License for the specific language governing permissions and
.. limitations under the License.

Room Version 2
==============

This room version builds off of `version 1 <v1.html>`_ with an improved state
resolution algorithm.

.. contents:: Table of Contents
.. sectnum::

Server implementation components
--------------------------------

.. WARNING::
   The information contained in this section is strictly for server implementors.
   Applications which use the Client-Server API are generally unaffected by the
   details contained here, and can safely ignore their presence.


Room version 2 uses the base components of `room version 1 <v1.html>`_, changing
only the state resolution algorithm.


State resolution
~~~~~~~~~~~~~~~~

The room state :math:`S'(E)` after an event :math:`E` is defined in terms of
the room state :math:`S(E)` before :math:`E`, and depends on whether
:math:`E` is a state event or a message event:

* If :math:`E` is a message event, then :math:`S'(E) = S(E)`.

* If :math:`E` is a state event, then :math:`S'(E)` is :math:`S(E)`, except
  that its entry corresponding to :math:`E`'s ``event_type`` and ``state_key``
  is replaced by :math:`E`'s ``event_id``.

The room state :math:`S(E)` before :math:`E` is the *resolution* of the set of
states :math:`\{ S'(E_1), S'(E_2), … \}` consisting of the states after each of
:math:`E`'s ``prev_event``\s :math:`\{ E_1, E_2, … \}`, where the resolution of
a set of states is given in the algorithm below.

Definitions
+++++++++++

The state resolution algorithm for version 2 rooms uses the following
definitions, given the set of room states :math:`\{ S_1, S_2, \ldots \}`:

Power events
  A *power event* is a state event with type ``m.room.power_levels`` or
  ``m.room.join_rules``, or a state event with type ``m.room.member`` where the
  ``membership`` is ``leave`` or ``ban`` and the ``sender`` does not match the
  ``state_key``. The idea behind this is that power events are events that might
  remove someone's ability to do something in the room.

Unconflicted state map and conflicted state set
  The *unconflicted state map* is the state where the value of each key exists
  and is the same in each state :math:`S_i`.  The *conflicted state set* is the
  set of all other state events. Note that the unconflicted state map only has
  one event per ``(event_type, state_key)``, whereas the conflicted state set
  may have multiple events.

Auth difference
  The *auth difference* is calculated by first calculating the full auth chain
  for each state :math:`S_i`, that is the union of the auth chains for each
  event in :math:`S_i`, and then taking every event that doesn't appear in
  every auth chain. If :math:`C_i` is the full auth chain of :math:`S_i`, then
  the auth difference is :math:`\cup C_i - \cap C_i`.

Full conflicted set
  The *full conflicted set* is the union of the conflicted state set and the
  auth difference.

Reverse topological power ordering
  The *reverse topological power ordering* of a set of events is the
  lexicographically smallest topological ordering based on the DAG formed by
  auth events. The reverse topological power ordering is ordered from earliest
  event to latest. For comparing two topological orderings to determine which
  is the lexicographically smallest, the following comparison relation on
  events is used: for events :math:`x` and :math:`y`, :math:`x<y` if

  1. :math:`x`'s sender has *greater* power level than :math:`y`'s sender,
     when looking at their respective ``auth_event``\s; or
  2. the senders have the same power level, but :math:`x`'s
     ``origin_server_ts`` is *less* than :math:`y`'s ``origin_server_ts``; or
  3. the senders have the same power level and the events have the same
     ``origin_server_ts``, but :math:`x`'s ``event_id`` is *less* than
     :math:`y`'s ``event_id``.

  The reverse topological power ordering can be found by sorting the events
  using Kahn's algorithm for topological sorting, and at each step selecting,
  among all the candidate vertices, the smallest vertex using the above
  comparison relation.

Mainline ordering
  Given an ``m.room.power_levels`` event :math:`P`, the *mainline of* :math:`P`
  is the list of events generated by starting with :math:`P` and recursively
  taking the ``m.room.power_levels`` events from the ``auth_events``, ordered
  such that :math:`P` is last. Given another event :math:`e`, the *closest
  mainline event to* :math:`e` is the first event encountered in the mainline
  when iteratively descending through the ``m.room.power_levels`` events in the
  ``auth_events`` starting at :math:`e`. If no mainline event is encountered
  when iteratively descending through the ``m.room.power_levels`` events, then
  the closest mainline event to :math:`e` can be considered to be a dummy event
  that is before any other event in the mainline of :math:`P` for the purposes
  of condition 1 below.

  The *mainline ordering based on* :math:`P` of a set of events is the
  ordering, from smallest to largest, using the following comparison relation
  on events: for events :math:`x` and :math:`y`, :math:`x<y` if

  1. the closest mainline event to :math:`x` appears *before* the closest
     mainline event to :math:`y`; or
  2. the closest mainline events are the same, but :math:`x`\'s
     ``origin_server_ts`` is *less* than :math:`y`\'s ``origin_server_ts``; or
  3. the closest mainline events are the same and the events have the same
     ``origin_server_ts``, but :math:`x`\'s ``event_id`` is *less* than
     :math:`y`\'s ``event_id``.

Iterative auth checks
  The *iterative auth checks algorithm* takes as input an initial room state
  and a sorted list of state events, and constructs a new room state by
  iterating through the event list and applying the state event to the room
  state if the state event is allowed by the `authorization rules`_. If the
  state event is not allowed by the authorization rules, then the event is
  ignored. If a ``(event_type, state_key)`` key that is required for checking
  the authorization rules is not present in the state, then the appropriate
  state event from the event's ``auth_events`` is used if the auth event is
  not rejected.

Algorithm
+++++++++

The *resolution* of a set of states is obtained as follows:

1. Take all *power events* and any events in their auth chains, recursively,
   that appear in the *full conflicted set* and order them by the *reverse
   topological power ordering*.
2. Apply the *iterative auth checks algorithm* on the *unconflicted state map*
   and the list of events from the previous step to get a partially resolved
   state.
3. Take all remaining events that weren't picked in step 1 and order them by
   the mainline ordering based on the power level in the partially resolved
   state obtained in step 2.
4. Apply the *iterative auth checks algorithm* on the partial resolved
   state and the list of events from the previous step.
5. Update the result by replacing any event with the event with the same key
   from the *unconflicted state map*, if such an event exists, to get the final
   resolved state.


.. _`authorization rules`: ../server_server/%SERVER_RELEASE_LABEL%.html#authorization-rules

Rejected events
+++++++++++++++

Events that have been rejected due to failing auth based on the state at the
event (rather than based on their auth chain) are handled as usual by the
algorithm, unless otherwise specified.

Note that no events rejected due to failure to auth against their auth chain
should appear in the process, as they should not appear in state (the algorithm
only uses events that appear in either the state sets or in the auth chain of
the events in the state sets).

.. admonition:: Rationale

  This helps ensure that different servers' view of state is more likely to
  converge, since rejection state of an event may be different. This can happen if
  a third server gives an incorrect version of the state when a server joins a
  room via it (either due to being faulty or malicious). Convergence of state is a
  desirable property as it ensures that all users in the room have a (mostly)
  consistent view of the state of the room. If the view of the state on different
  servers diverges it can lead to bifurcation of the room due to e.g. servers
  disagreeing on who is in the room.

  Intuitively, using rejected events feels dangerous, however:

  1. Servers cannot arbitrarily make up state, since they still need to pass the
     auth checks based on the event's auth chain (e.g. they can't grant themselves
     power levels if they didn't have them before).
  2. For a previously rejected event to pass auth there must be a set of state
     that allows said event. A malicious server could therefore produce a
     fork where it claims the state is that particular set of state, duplicate the
     rejected event to point to that fork, and send the event. The
     duplicated event would then pass the auth checks. Ignoring rejected events
     would therefore not eliminate any potential attack vectors.


Rejected auth events are deliberately excluded from use in the iterative auth
checks, as auth events aren't re-authed (although non-auth events are) during
the iterative auth checks.