consider generating URL strings from a URLPattern

Open wanderview opened this issue 4 years ago • 32 comments

One of the most frequent pieces of feedback we have gotten is that developers would like to generate URL and component strings from a URLPattern. For example, something like:

const pattern = new URLPattern({ pathname: '/product/:id' });
const my_pathname = pattern.generate('pathname', { id: '12345' });
// my_pathname is "/product/12345"

There is a discussion thread about this in #41. I'm filing this issue to consider it as a future enhancement.

wanderview avatar Aug 11 '21 20:08 wanderview

The reference library does have some behaviour we would probably want to smooth over, especially given that people are likely to use this for user input.

In particular, path segments should probably always be validated and encoded by default, so that slashes or relative path segments ("." and "..") in user input cannot be used for attacks.
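A minimal sketch of what "validated and encoded by default" could mean for a single substituted value. The helper name encodePathSegment is hypothetical and not part of URLPattern or path-to-regexp; it only assumes the standard encodeURIComponent function.

```javascript
// Hypothetical helper (not part of URLPattern): make one user-supplied value
// safe to use as a single path segment.
function encodePathSegment(value) {
  const text = String(value);
  // encodeURIComponent leaves "." untouched, so dot segments are rejected explicitly.
  if (text === "." || text === "..") {
    throw new TypeError(`Refusing relative path segment "${text}"`);
  }
  // "/" becomes "%2F", so the value cannot introduce extra segments.
  return encodeURIComponent(text);
}

const safe = encodePathSegment("../admin");
// safe is "..%2Fadmin", which is one opaque segment rather than a traversal
```

Whether a real generate() should throw or silently encode in these cases is an open design question; the sketch throws only for the values that encoding alone cannot neutralise.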

Also, the path-to-regexp docs don't specify any behaviour for non-named pattern segments. For example, what would we do with new URLPattern({ pathname: "/*.jpg" })? One option would be to just throw an error. Alternatively, they could be treated as numeric-indexed, although this is tricky in itself, e.g.:

const pattern = new URLPattern({ pathname: "/foo/*" });
// Almost certainly "/foo/test" is the right behaviour here. As there's only
// a single wildcard, perhaps the array wrapper could be optional too.
pattern.generate("pathname", ["test"]);

const pattern2 = new URLPattern({ pathname: "/images/*.jpg" });
// Less clear which of these would be correct. It would make sense that people
// would want to include the full name, although others might expect that only the * is replaced.
pattern2.generate("pathname", ["1.jpg"]);
pattern2.generate("pathname", ["1"]);

It might also make more sense as a separate API, as with path-to-regexp, especially if not all syntax is supported for the reason above, e.g. new URLTemplate("/product/:id").generate("pathname", { id: "12345" })
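To make the "separate API" idea concrete, here is a rough sketch of a generate-only class that rejects syntax it cannot reverse. URLTemplate is the hypothetical name from the comment above; the accepted syntax (literal text plus ":name" groups) and the pathname-only restriction are assumptions of this sketch, not anything specified.

```javascript
// Hypothetical class sketching a generate-only API; the accepted syntax is an
// assumption made for illustration.
class URLTemplate {
  #pathname;

  constructor(pathname) {
    // Only literal text and ":name" groups are accepted; wildcards, regexp
    // groups, and modifiers have no obvious reverse mapping.
    if (/[*(){}?+]/.test(pathname)) {
      throw new TypeError(`Unsupported syntax in template "${pathname}"`);
    }
    this.#pathname = pathname;
  }

  generate(component, values) {
    if (component !== "pathname") {
      throw new TypeError(`Unsupported component "${component}"`);
    }
    return this.#pathname.replace(/:([A-Za-z_$][\w$]*)/g, (_, name) => {
      if (!Object.prototype.hasOwnProperty.call(values, name)) {
        throw new TypeError(`Missing value for group "${name}"`);
      }
      return encodeURIComponent(String(values[name]));
    });
  }
}

const productPath = new URLTemplate("/product/:id").generate("pathname", { id: "12345" });
// productPath is "/product/12345"
```

Failing at construction time means an unsupported template such as "/images/*.jpg" is reported once, up front, instead of on every generate() call.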

Jamesernator avatar Aug 14 '21 04:08 Jamesernator

If you translate a wildcard to a capturing group, the expectation should be that only that group is replaced during compilation:

const pattern = new URLPattern({ pathname: "/images/*.jpg" });
pattern.generate('pathname', ['1']) // "/images/1.jpg"

I'd treat wildcards as non-named pattern segments, implying that the way they are compiled is the same as for the named segments. The only difference is the compilation input:

// This is a named equivalent of the wildcard example above.
const pattern = new URLPattern({ pathname: "/images/:filename.jpg" });
pattern.generate('pathname', { filename: "1" }) // "/images/1.jpg"

Replacing a fixed portion of the pattern (".jpg") would be unexpected.
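The substitution rule described here can be sketched in plain JavaScript. generatePathname is a hypothetical helper, not a URLPattern method, and it only understands ":name" groups and "*" wildcards. Wildcards are keyed 0, 1, ... in order of appearance, so an array works as input for them.

```javascript
// Hypothetical helper, not part of URLPattern: fills ":name" groups and "*"
// wildcards in a pathname template and leaves all fixed text untouched.
function generatePathname(template, values) {
  let wildcardIndex = 0;
  return template.replace(/:([A-Za-z_$][\w$]*)|\*/g, (token, name) => {
    // Named groups use their name; wildcards get the next numeric key.
    const key = name !== undefined ? name : String(wildcardIndex++);
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new TypeError(`Missing value for group "${key}"`);
    }
    return encodeURIComponent(String(values[key]));
  });
}

const fromWildcard = generatePathname("/images/*.jpg", ["1"]);
const fromNamed = generatePathname("/images/:filename.jpg", { filename: "1" });
// Both are "/images/1.jpg"; the fixed ".jpg" suffix is never replaced.
```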

kettanaito avatar Aug 26 '21 13:08 kettanaito

Wildcards are unnamed matching groups today. So they do indeed get an automatically assigned numeric name.

wanderview avatar Aug 26 '21 14:08 wanderview

I was mainly addressing the comment from @Jamesernator in regards to the ambiguous match of *.jpg. I think the expectation is to have only the asterisk replaced, just as the expectation is that only a named parameter is replaced (:name.jpg). I think this may not be a concern.

kettanaito avatar Aug 26 '21 14:08 kettanaito

Just would like to chime in that this is very important, and that I can’t imagine a case where you read urls into objects without also turning objects into urls. Search pages for example have lots of generated links (think filters, pagination), but also read the url into local state.

Haroenv avatar Sep 02 '21 21:09 Haroenv

I was just wondering whether a result could be mutable. For example, say that you are on shopify.com/categories/computers and want to navigate to the games section at shopify.com/categories/games:

const pattern = new URLPattern({ pathname: ':vendor/categories/:category' })
// Get current url pattern
const result = pattern.exec(location.href)

result.pathname.groups.category = 'games' // Modify current url match result
location.href = result.toString() // change the url location

I'm thinking this could be useful for an advanced search with two-way data binding or something.

A generate function could work equally well, though. Maybe a mutable URLPattern result isn't desirable?

jimmywarting avatar Feb 03 '22 11:02 jimmywarting

The pattern being mutable can make sense, since new URL sets a precedent. Personally, though, I would then need to make a copy of the pattern for every change: a page has many links with different state, and each link shows not the current state but a possible future state.
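The "copy per link" concern can be illustrated with the existing URL API, which is mutable in the same way. This is only a sketch of the use case: linkFor is a hypothetical helper and the example.com URL is made up.

```javascript
// Hypothetical helper: build a link for a possible future state without
// touching the object that represents the current state.
function linkFor(currentHref, changes) {
  const url = new URL(currentHref); // fresh copy for every link
  for (const [key, value] of Object.entries(changes)) {
    url.searchParams.set(key, String(value));
  }
  return url.href;
}

const current = "https://example.com/search?q=laptop&page=1";
const nextPage = linkFor(current, { page: 2 });
const filtered = linkFor(current, { category: "games" });
// nextPage is "https://example.com/search?q=laptop&page=2"
// filtered is "https://example.com/search?q=laptop&page=1&category=games"
```

A generate function that takes the values as an argument avoids the copying step entirely, because nothing shared is mutated.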

Haroenv avatar Feb 03 '22 12:02 Haroenv

Would be helpful for my use case. I use a URL pattern to

  1. determine type of page
  2. extract information from the url
  3. reconstruct the URL from extracted information

I'm sure this will be a common enough use case for URLPattern.
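The three steps above can be sketched today with regular expressions and hand-written builders, which shows the duplication a generate API would remove. The route table, parsePath, and buildPath are all hypothetical names for illustration.

```javascript
// Hypothetical route table: each entry needs both a matcher and a builder,
// which is the duplication a URLPattern generate API would avoid.
const routes = [
  {
    type: "product",
    regex: /^\/product\/(?<id>[^/]+)$/,
    build: (groups) => `/product/${encodeURIComponent(groups.id)}`,
  },
  {
    type: "category",
    regex: /^\/categories\/(?<category>[^/]+)$/,
    build: (groups) => `/categories/${encodeURIComponent(groups.category)}`,
  },
];

// Steps 1 and 2: determine the page type and extract the information.
function parsePath(pathname) {
  for (const route of routes) {
    const match = route.regex.exec(pathname);
    if (match) {
      const groups = {};
      for (const [key, value] of Object.entries(match.groups)) {
        groups[key] = decodeURIComponent(value);
      }
      return { type: route.type, groups };
    }
  }
  return null;
}

// Step 3: reconstruct the URL path from the extracted information.
function buildPath(type, groups) {
  const route = routes.find((candidate) => candidate.type === type);
  if (!route) throw new TypeError(`Unknown page type "${type}"`);
  return route.build(groups);
}

const parsed = parsePath("/product/12345");
const rebuilt = buildPath(parsed.type, parsed.groups);
// rebuilt is "/product/12345"
```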

guy-borderless avatar Jun 24 '23 12:06 guy-borderless

This API is highly beneficial for i18n/localization, where localized URLs have to be mapped back to the original (delocalized) URL. Here are docs for the use case that should be accounted for: https://inlang.com/m/gerre34r/library-inlang-paraglideJs/strategy#url.

samuelstroschein avatar Feb 26 '25 20:02 samuelstroschein

I feel it's better to expose this API on the URLPattern interface rather than making the result mutable by adding toString() to URLPatternResult, which is currently defined as a dictionary and doesn't have any methods.

sisidovski avatar Mar 18 '25 08:03 sisidovski

We'd like to make progress on this proposal to resolve https://github.com/w3c/IFT/issues/259.

I've summarized the expected inputs and outputs here. I think this is still not comprehensive, and I'd appreciate any feedback there. https://docs.google.com/document/d/1ca6geyHD40MfHkalgEv4AcBo9-rHK6CrAOxwcWor9WA/edit?usp=sharing

sisidovski avatar Apr 04 '25 01:04 sisidovski

@sisidovski Take a look at this code, a custom generate function that is used for localization. It might reveal one or two insights.

samuelstroschein avatar Apr 04 '25 15:04 samuelstroschein

As suggested by @annevk in https://docs.google.com/document/d/1ca6geyHD40MfHkalgEv4AcBo9-rHK6CrAOxwcWor9WA/edit?disco=AAABhYZWjLY, we might need to consider a way to generate a complete URL from a URLPattern instead of specifying the component. Note that the complete URL is somewhat touched on in https://github.com/whatwg/urlpattern/issues/73#issuecomment-1028873691.

Would it make sense to pass a record to generate to deal with the complete URL? A JavaScript example might look something like:

const pattern = new URLPattern("https://example.com/:path");
const result = await pattern.generate({'path': 'image/a.jpg'});
const generatedURL = result.toString();  // expecting "https://example.com/image/a.jpg"

What do you think?

yoshisatoyanagisawa avatar Apr 07 '25 02:04 yoshisatoyanagisawa

Yeah, it makes sense to consider generating a full URL.

fyi, the reference library only supports pathname to generate a string tho. https://github.com/pillarjs/path-to-regexp#compile-reverse-path-to-regexp

I'm wondering how we should handle the case if there are two named groups which have the same name?

const pattern = new URLPattern('https://{:pattern.}?example.com/:pattern');
const result = await pattern.generate({'pattern': 'image/a.jpg'});
// => ???

It looks like the current design has no way to substitute different strings for two groups that share the same name in different components.

sisidovski avatar Apr 07 '25 21:04 sisidovski

> fyi, the reference library only supports pathname to generate a string tho. https://github.com/pillarjs/path-to-regexp#compile-reverse-path-to-regexp
>
> I'm wondering how we should handle the case if there are two named groups which have the same name?

Yeah, applying it to components other than the pathname sounds challenging. Is it realistic to limit generate to the pathname component only?

> const pattern = new URLPattern('https://{:pattern.}?example.com/:pattern');
> const result = await pattern.generate({'pattern': 'image/a.jpg'});
> // => ???
>
> It looks the current design doesn't have a capability to substitute two strings which have same names in a different component.

Even if it is useless, I feel it would be natural to generate {'protocol': 'https', 'hostname': 'image/a.jpg.example.com', 'pathname': '/image/a.jpg', ...}. It should be the web developer's responsibility to avoid practically unusable cases.

yoshisatoyanagisawa avatar Apr 08 '25 07:04 yoshisatoyanagisawa

Maybe duplicate names should throw? But I was wondering if we could avoid having to specify the component. At least it seems like it would be nice if you could do

const pattern = new URLPattern("https://{:host}/{:path1}/{:path2}/");
pattern.generate({ host: "example.com", path1: "foo", path2: "bar/baz" });
// "https://example.com/foo/bar%2fbaz/"
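A rough sketch of how component-free generation with a duplicate-name check might behave. generateURL and its per-component input shape are hypothetical, and throwing on duplicates is just one of the options discussed in this thread. Note that encodeURIComponent emits uppercase hex, so the slash comes out as %2F rather than %2f; the two are equivalent percent-encodings.

```javascript
// Hypothetical sketch: fill ":name" groups across components from one record.
function generateURL(components, values) {
  const seen = new Set();
  const fill = (template, encode) =>
    template.replace(/:([A-Za-z_$][\w$]*)/g, (_, name) => {
      // Assumption: the same name in two places is treated as an error.
      if (seen.has(name)) throw new TypeError(`Duplicate group name "${name}"`);
      seen.add(name);
      if (!Object.prototype.hasOwnProperty.call(values, name)) {
        throw new TypeError(`Missing value for group "${name}"`);
      }
      return encode(String(values[name]));
    });

  const hostname = fill(components.hostname, (value) => value);
  const pathname = fill(components.pathname, encodeURIComponent);
  const url = new URL(`${components.protocol}://${hostname}${pathname}`);
  // Reject values that changed the URL structure, e.g. a "/" in the host.
  if (url.hostname !== hostname.toLowerCase()) {
    throw new TypeError(`Invalid hostname "${hostname}"`);
  }
  return url.href;
}

const generated = generateURL(
  { protocol: "https", hostname: ":host", pathname: "/:path1/:path2/" },
  { host: "example.com", path1: "foo", path2: "bar/baz" },
);
// generated is "https://example.com/foo/bar%2Fbaz/"
```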

@garretrieger maybe you could also weigh in with regards to requirements?

annevk avatar Apr 08 '25 07:04 annevk

For the IFT use case, the set of variables is fixed and the variables have well-defined values (base32/base64 encodings), so it would be pretty unusual for an IFT template to substitute one of the variables outside of the path or query string.

garretrieger avatar Apr 09 '25 00:04 garretrieger