Clarify usage of SRV record mechanics #15

Open
opened 2023-08-01 15:11:36 +02:00 by zocker.net · 0 comments

Problem Description

The spec currently (v1.7) states that, for the Server-Server API, server discovery may work via a well-known URI or via SRV records. SRV records (as per RFC 2782, https://www.rfc-editor.org/rfc/rfc2782) allow defining a priority & weight for automatic fallback & weighted selection among multiple servers (similar to the preference values of MX records).

However, the spec does not clarify whether those rules should apply. More importantly, the spec does not state when a re-selection should happen (when first connecting to the server; every day; ...).

The well-known method was introduced after SRV record delegation, but it does not support fallback or weighted selection at all.
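For reference, the priority & weight rules mentioned above can be sketched roughly as follows. This is a minimal illustration of the ordering described in RFC 2782, not what any Matrix homeserver actually does; `SrvRecord` and `order_srv_records` are hypothetical names, and the records are assumed to be already resolved from DNS.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def order_srv_records(records, rng=None):
    """Return the records in the order a client would try them (RFC 2782 style)."""
    if rng is None:
        rng = random.Random()
    ordered = []
    # Lower priority values are tried first; higher ones only serve as fallback.
    for priority in sorted({r.priority for r in records}):
        # Weight-0 records are placed first in the candidate list, as the RFC asks,
        # which gives them only a very small chance of being picked.
        group = sorted(
            (r for r in records if r.priority == priority),
            key=lambda r: r.weight != 0,
        )
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)  # uniform, both ends inclusive
            running = 0
            for index, record in enumerate(group):
                running += record.weight
                if running >= pick:
                    ordered.append(group.pop(index))
                    break
    return ordered
```

With two records of priority 10 (weights 60 and 40) and one record of priority 20, the priority-20 record always comes last, while the order of the first two varies from call to call in proportion to their weights. How often this ordering is recomputed is exactly the open question of this issue.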

Certificate Validation

As of now, when using SRV record delegation, the target servers still need to serve a certificate for the Matrix hostname (e.g. if example.com SRV-delegates to matrix1.example.com & matrix2.example.com, then both need to have a valid cert for example.com). This makes SRV delegation somewhat harder to use, especially for delegation to 3rd parties. However, assuming strong mutual authentication in the Client-Server connection and enforced E2EE, this restriction might be unnecessary (see this summary: https://matrix.to/#/!oqWPfLVcyMbjqQflBy:pixie.town/$IpoZTkaZ4FvjNS28a1H9OX7hXrheD1jZ-dlJY7ZyT54?via=pixie.town&via=matrix.org&via=the-apothecary.club).
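To make the restriction concrete, here is a small sketch of how a connecting server might separate "where to connect" from "which name the certificate must match". The function names are hypothetical and this is not taken from any existing implementation; it only uses Python's standard `ssl` and `socket` modules.

```python
import socket
import ssl


def tls_parameters(server_name, srv_target, srv_port):
    """Describe where to connect and which name the certificate is checked against."""
    return {
        "connect_to": (srv_target, srv_port),  # taken from the SRV record
        "verify_hostname": server_name,        # the Matrix hostname, not the SRV target
    }


def open_federation_connection(server_name, srv_target, srv_port, timeout=10.0):
    """Open a TLS connection to the SRV target, verified against the Matrix hostname."""
    params = tls_parameters(server_name, srv_target, srv_port)
    context = ssl.create_default_context()  # verifies the chain and the hostname
    raw = socket.create_connection(params["connect_to"], timeout=timeout)
    try:
        return context.wrap_socket(raw, server_hostname=params["verify_hostname"])
    except Exception:
        raw.close()
        raise
```

In the example from above, `tls_parameters("example.com", "matrix1.example.com", 8448)` connects to matrix1.example.com but still expects a certificate valid for example.com, which is why a 3rd party hosting matrix1.example.com would need a cert for the delegating domain.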

Implementation Status

Discovery Implementations

  • Synapse seems to follow the SRV record rules (unknown to what degree)
  • Construct ignores SRV priorities & weights for now (see issue: https://github.com/matrix-construct/construct/issues/25)

Servers using SRV records

  • saklad5.com (see here: https://github.com/matrix-org/matrix-spec-proposals/pull/3922#pullrequestreview-1223322701)

Possible Solutions

  • remove SRV record delegation entirely (disrupts servers using it)
    • already proposed in MSC 3922 (https://github.com/matrix-org/matrix-spec-proposals/pull/3922)
  • state that re-selection applies
    • only on first connection (disabling fallback & traffic shaping)
    • only every time the other server seems offline (enabling fallback & limited traffic shaping)
    • daily (enabling fallback & more traffic shaping)
    • for every connection attempt (full fallback & traffic shaping, at the cost of more computation)
  • (optional) add priority & weight fields to the well-known definition
    • well-known currently allows
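
The four re-selection options listed above could be expressed as a small policy check like the one below. This is only a sketch to compare the options; `ReselectPolicy`, `SelectionState` and `should_reselect` are hypothetical names, and "daily" is interpreted here as "after a failed attempt, or once the current selection is at least 24 hours old", which is an assumption rather than something the issue or the spec states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

DAY_SECONDS = 24 * 60 * 60


class ReselectPolicy(Enum):
    FIRST_CONNECTION = "first_connection"
    ON_FAILURE = "on_failure"
    DAILY = "daily"
    EVERY_ATTEMPT = "every_attempt"


@dataclass
class SelectionState:
    selected_at: Optional[float] = None  # Unix time of the last selection, None if never
    last_attempt_failed: bool = False


def should_reselect(policy, state, now):
    """Decide whether the SRV target should be selected again before connecting."""
    if state.selected_at is None:
        return True  # nothing selected yet, every policy has to select once
    if policy is ReselectPolicy.FIRST_CONNECTION:
        return False  # keep the first choice: no fallback, no traffic shaping
    if policy is ReselectPolicy.ON_FAILURE:
        return state.last_attempt_failed
    if policy is ReselectPolicy.DAILY:
        # Assumption: a failed attempt also triggers re-selection, so fallback still works.
        return state.last_attempt_failed or now - state.selected_at >= DAY_SECONDS
    return True  # EVERY_ATTEMPT: always re-run the weighted selection
```

A caller would pass the current Unix time as `now` and update `SelectionState` after each selection and each connection attempt.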

Discussions in Matrix Room

  • beginning of the first discussion round: https://matrix.to/#/!oqWPfLVcyMbjqQflBy:pixie.town/$mWtr1Gr5NGyxFnKXbOOzfmro1-4yqyZ_oAbhTLNQqvA?via=pixie.town&via=matrix.org&via=the-apothecary.club
  • summaries from some participants & (probable) end of discussion: https://matrix.to/#/!oqWPfLVcyMbjqQflBy:pixie.town/$yVSIYynL_EYWKLMx_o3_lqKEvpPo8uYqWDDRcLcN6CY?via=pixie.town&via=matrix.org&via=the-apothecary.club
Reference
Atoki/discussion#15