element-web Room URLs are technically invalid

Technically, unescaped # symbols are not allowed in a URL fragment, yet we use them in URLs such as https://riot.im/app/#/room/#matrix:matrix.org. This causes problems for some conservative URL parsers such as Ruby's, which appear not to have heard of Postel's law O:-) - see the general unhappiness over at https://meta.discourse.org/t/broken-links-blank-page-with-appearing-in-url/52640/11.
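As a rough illustration of how a lenient parser copes with such a URL, here is a minimal sketch using Python's standard `urllib.parse.urlsplit`. It only shows one parser's behaviour; it says nothing about how Ruby or Discourse handle the same input.

```python
from urllib.parse import urlsplit

# The room URL from this issue, with a second, unescaped "#" inside the fragment.
url = "https://riot.im/app/#/room/#matrix:matrix.org"

parts = urlsplit(url)

# urlsplit cuts at the first "#" and treats everything after it as the fragment,
# so the second "#" simply ends up inside the fragment text.
print(parts.path)      # /app/
print(parts.fragment)  # /room/#matrix:matrix.org
```

A stricter parser may instead reject the second # outright, which is the failure mode described above.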

To fix this, I suggest we switch matrix.to to compact URLs by default, i.e. solve https://github.com/matrix-org/matrix.to/issues/17. Then, given that people generally shouldn't be passing around riot URLs anyway, it's not a disaster if we encode those URLs 'correctly' with a %23 in them.
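For what the 'correct' encoding might look like, here is a small sketch using Python's `urllib.parse.quote`. The variable names are made up for illustration, and keeping ":" unescaped is an assumption about what we would want the links to look like.

```python
from urllib.parse import quote, unquote

room_alias = "#matrix:matrix.org"

# Percent-encode the alias for use inside the fragment.
# safe=":" keeps the colon readable; "#" becomes %23.
encoded = quote(room_alias, safe=":")
url = "https://riot.im/app/#/room/" + encoded

print(url)  # https://riot.im/app/#/room/%23matrix:matrix.org

# Decoding gives back the original alias.
assert unquote(encoded) == room_alias
```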

Or alternatively we can be thoroughly evil and encode them with a http://www.fileformat.info/info/unicode/char/2317/index.htm instead of a # :>

Mar 31 '17 18:03 ara4n

Do we have everything we need to move forward with this? Are we free to move forward with implementing:

https://matrix.to/#room:domain.tld and https://matrix.to/@user:domain.tld

If there are outstanding questions about the @user form, can we still move forward with the #room links independently?

Apr 06 '17 13:04 lampholder

Technically @ is illegal there, but Ruby doesn't choke on it, so I am inclined to go with it anyway. If people complain further, we can always provide a https://matrix.to/u/matthew:matrix.org style workaround or something horrid.

Apr 06 '17 13:04 ara4n

Why not just example.com/rooms#name:domain.tld?

Sep 08 '18 05:09 martindale

Because then you need to configure your web server to serve the file for subpaths and not only the root.
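To make the difference concrete: the fragment is never sent to the server, so only the path part of each URL style has to be served. The sketch below uses `urlsplit` as a stand-in for the browser's own parsing; the helper name `request_path` is hypothetical, and the `https://` scheme on the example.com URL is added here as an assumption.

```python
from urllib.parse import urlsplit

def request_path(url: str) -> str:
    """Return the path component of a URL; the fragment stays client-side."""
    return urlsplit(url).path

# Current style: everything after the first "#" is handled in the browser,
# so the server only ever has to serve /app/.
print(request_path("https://riot.im/app/#/room/#matrix:matrix.org"))  # /app/

# Proposed style: the server must also answer for the /rooms subpath.
print(request_path("https://example.com/rooms#name:domain.tld"))      # /rooms
```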

Sep 08 '18 09:09 t3chguy

You're requiring a web server in the first place; of course it needs to be configured. It seems to solve several issues at once without any drawbacks.

Sep 09 '18 03:09 martindale

Riot-web doesn't require a web server; you can just browse a filesystem copy using the browser's file:// protocol.

Sep 09 '18 08:09 t3chguy

Replying to the original issue, @ara4n: could you please cite the section that states that you can't have # or @ within your anchor? I have an artistic project in which I generate almost every character in the anchor, and I would be very sad if parsers started making additional assumptions about how a URI could or should be structured. (Yes, I'm looking at the markdown parser and the autolinker within Element breaking my links without additional mitigation, but that's a topic for a different room.)

https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Syntax

": and @ may appear unencoded within the path, query, and fragment; and ? and / may appear unencoded as data within the query or fragment"

The RFC actually defines a sane interpretation for using reserved characters unencoded:

https://datatracker.ietf.org/doc/html/rfc3986#section-2.2

"If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII."
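For reference, the fragment rule from the RFC 3986 grammar (section 3.5 and Appendix A) can be transcribed into a regular expression. This is a sketch for checking strings against the strict grammar only; the names `FRAGMENT_RE` and `is_strict_fragment` are made up for illustration and do not describe how any particular parser behaves.

```python
import re

# RFC 3986:
#   fragment   = *( pchar / "/" / "?" )
#   pchar      = unreserved / pct-encoded / sub-delims / ":" / "@"
#   unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
#   sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
FRAGMENT_RE = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*")

def is_strict_fragment(fragment: str) -> bool:
    """True if the string matches the RFC 3986 fragment production."""
    return FRAGMENT_RE.fullmatch(fragment) is not None

print(is_strict_fragment("/room/@user:domain.tld"))      # True: ":" and "@" are in pchar
print(is_strict_fragment("/room/#matrix:matrix.org"))    # False: "#" is not in the grammar
print(is_strict_fragment("/room/%23matrix:matrix.org"))  # True: "#" is percent-encoded
```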

Incidentally, I myself have opted to always escape # for a different reason: my payload can be present in either the fragment or the query according to the desire of the user.
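A minimal sketch of that approach, assuming a hypothetical helper `build_url` and an example.org base URL (neither comes from my actual project): encoding every reserved character means the same payload survives in either position, whereas an unescaped # in the query would end the query and start a fragment.

```python
from urllib.parse import quote, unquote, urlsplit

def build_url(base: str, payload: str, in_fragment: bool) -> str:
    """Attach the payload as either the fragment or the query of base."""
    # safe="" percent-encodes every reserved character, including "#", "?" and "&".
    encoded = quote(payload, safe="")
    return f"{base}#{encoded}" if in_fragment else f"{base}?{encoded}"

payload = "#room:domain.tld"
as_query = build_url("https://example.org/", payload, in_fragment=False)
as_fragment = build_url("https://example.org/", payload, in_fragment=True)

print(as_query)     # https://example.org/?%23room%3Adomain.tld
print(as_fragment)  # https://example.org/#%23room%3Adomain.tld

# The payload round-trips from both positions.
assert unquote(urlsplit(as_query).query) == payload
assert unquote(urlsplit(as_fragment).fragment) == payload
```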

Sep 14 '22 15:09 bkil