
Relative URLs?

Open snej opened this issue 8 years ago • 11 comments

The spec doesn't mention relative URLs — are they allowed? And if so, what is the base URL they're interpreted relative to?

I recall this coming up as a bit of a compatibility issue in handling RSS/Atom feeds.

(My preference would be to allow them, since they make the feed more compact, and that they should be relative to the home_page_url, since that's likely to be a parent or sibling of where the individual articles' URLs will be. But then there are edge cases like what if there's no home_page_url...)

snej avatar May 17 '17 19:05 snej

We were just discussing this earlier today. I actually wish that all URLs (links and image references) would be absolute URLs, since there's less room for bugs between different feed reader implementations. But it seems difficult to try to enforce that.

I think you're right that the base URL should be home_page_url. It's strongly recommended to include it. Really all feeds that are on the web should have home_page_url.

manton avatar May 17 '17 20:05 manton

I would argue for absolute URLs. Relative URLs make sense only if you expect the majority of use cases to be a site generating JSON for use within that same site.

Once you rely on a JSON feed for syndication, apps, and off-site usage, relative URLs would likely introduce more problems than they solve (as per @manton's point).

donohoe avatar May 17 '17 21:05 donohoe

I think the spec should allow relative urls, if for no other reason than people are going to use them whether the spec allows it or not.

In #38, I suggested splitting home_page_url into two fields (site_url and resource_url, with the latter serving the primary stated purpose of home_page_url) for semantic clarity. Following from that, I would propose that:

  • A base url can be specified explicitly with a new base_url field
  • A base url can be specified implicitly from home_page_url or resource_url, if either is present. (Specifically: the site_url proposed in #38 is not considered a source for this purpose because it is intended to describe a different semantic object.)
  • Parsers should follow the same rules for interpreting paths as the HTML <base> element.
  • Using paths (absolute or relative) without specifying a base url is an error. This is probably a semantic (not syntactic) error. I have no opinion on how parsers should indicate their displeasure.
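
For illustration, the rules in the list above could be sketched in Python roughly as follows. This is only a sketch under assumptions: base_url and resource_url are fields proposed in this thread and are not part of the spec, the precedence between home_page_url and resource_url is a guess (the proposal does not state one), and the function names are hypothetical. Resolution is delegated to the standard library's `urljoin`, which applies the usual RFC 3986 reference-resolution rules that `<base>` also relies on.

```python
from urllib.parse import urljoin, urlsplit


def pick_base_url(feed):
    """Return the base URL for a parsed feed dict, or None if there is none.

    Assumed precedence: explicit base_url, then home_page_url, then
    resource_url. The last two are interchangeable in the proposal; the
    order here is an assumption.
    """
    for key in ("base_url", "home_page_url", "resource_url"):
        value = feed.get(key)
        if value:
            return value
    return None


def resolve(feed, ref):
    """Resolve a URL reference found in the feed against the feed's base URL."""
    if urlsplit(ref).scheme:
        # Already a full URL with a scheme; no base is needed.
        return ref
    base = pick_base_url(feed)
    if base is None:
        # The proposal treats a path without any base URL as an error.
        raise ValueError(f"path {ref!r} used but the feed has no base URL")
    return urljoin(base, ref)
```

With a feed whose home_page_url is `https://example.org/blog/`, a reference such as `posts/1.html` resolves under `/blog/`, while `/img/a.png` resolves against the host root, which is the same behavior a browser gives for a `<base>` element.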

jbafford avatar May 19 '17 02:05 jbafford

Given that simplicity seems to be key to JSON Feed, it might make sense to have a strict rule requiring absolute URLs.

The current suggestion is that if a feed is invalid, don't parse it. So as long as people realize that their feeds are invalid when they use relative URLs, and thus won't be parsed, they have a pretty strong incentive to use absolute URLs.

voidfiles avatar May 19 '17 20:05 voidfiles

My vote is to require absolute URLs in JSON Feed itself.

However, relative URLs within content_html fields of items should be allowed, and the spec should be updated to clarify that such URLs should be interpreted relative to that item's url. A great many blogs represent intra-blog links (including image URLs) with relative links.

glv avatar May 23 '17 14:05 glv

Just to be clear, any url or _url field should definitely be an absolute URL. The way I interpreted this issue is that it was only about links in content_html. As others have pointed out, while relative URLs are annoying to deal with in a feed reader, it's too much to expect that feed generators will handle this. They'd have to parse a blog post's content and update all the links, <img>, etc. while serving the JSON.
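
As a rough illustration of the rule that any url or _url field should be absolute, a checker could walk a parsed feed and report the offending fields. This is a sketch, not anything the spec defines: the function names are hypothetical, and "absolute" is assumed here to mean that the value has both a scheme and a host.

```python
from urllib.parse import urlsplit


def is_absolute(url):
    """Assumed definition of absolute: the URL has both a scheme and a host."""
    parts = urlsplit(url)
    return bool(parts.scheme and parts.netloc)


def relative_url_fields(obj, path=""):
    """Yield the path of every `url` / `*_url` field whose value is not absolute.

    Fields such as content_html are not inspected; only keys named `url`
    or ending in `_url` are checked.
    """
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = f"{path}.{key}" if path else key
            if isinstance(value, str) and (key == "url" or key.endswith("_url")):
                if not is_absolute(value):
                    yield here
            else:
                # Recurse into nested objects and arrays (items, author, ...).
                yield from relative_url_fields(value, here)
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from relative_url_fields(value, f"{path}[{index}]")
```

A validator could call `list(relative_url_fields(feed))` on the decoded JSON and reject or warn about the feed if the list is non-empty.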

manton avatar May 23 '17 14:05 manton

OK then … so the only question is, should relative URLs in content_html be based on the item's url or on the feed's home_page_url? Does it matter?

glv avatar May 23 '17 17:05 glv

(To be clear … for the typical case of a feed for a blog on a single site, either one should work fine. But what about the case for a feed that aggregates items from multiple sites?)

glv avatar May 23 '17 17:05 glv

@glv That's a great point about a feed aggregated from multiple sites. Micro.blog already serves a timeline feed like this that includes multiple users/sites. It would have to be relative to the item's URL in that case.

Also worth pointing out that there are a few variations of relative URLs. It would greatly simplify feed readers if they could depend on relative URLs starting with /, so that the full URL could easily be constructed from the item URL or home page hostname. Again, I'm torn on making any of this a requirement, but it would be nice. (Blog systems with links that don't start with / are unlikely to work anyway across archive or category pages.)
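
To make the aggregated-feed case concrete, here is a minimal sketch of how a reader might resolve links found in content_html against the item's own url. It is an illustration under assumptions, not an agreed behavior: the fallback to home_page_url when an item has no url is a guess, only href and src attributes are considered, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the raw href/src attribute values found in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def resolved_links(item, feed):
    """Return the links in item's content_html, resolved to absolute URLs.

    The base is the item's url; falling back to the feed's home_page_url
    is an assumption made for this sketch.
    """
    collector = LinkCollector()
    collector.feed(item.get("content_html", ""))
    collector.close()
    base = item.get("url") or feed.get("home_page_url")
    if base is None:
        # Nothing to resolve against; return the links untouched.
        return list(collector.links)
    return [urljoin(base, link) for link in collector.links]
```

For an item at `https://alice.example/2017/05/post.html` inside an aggregated feed hosted elsewhere, a link to `/about` resolves against alice.example rather than the aggregator's host, and a link without a leading slash such as `pic.png` resolves relative to the post's directory.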

manton avatar May 23 '17 17:05 manton

The case for relative URLs is that the code generating the feed may not know the exact hostname/port the site is being served from, especially if the server is behind a proxy. And as a general CS principle, it seems like a good approach to keep the code for a single page from having to know higher-level details like this. All the feed generator should really need to know is where its JSON file and the articles appear relative to each other.

As a user of various CMSs over time, I find it annoying when a working site breaks just because I moved it from a staging server to a real server, or because I put it under a path instead of at the root.

snej avatar May 24 '17 17:05 snej