
Retain all tag information, even if mapped to a core attribute

Open mmcdole opened this issue 10 years ago • 1 comments

This is a design change that I think would be a positive move for feedparser. It is somewhat related to #24 but isn't exactly the same.

Right now, if feedparser parses an element that maps to one of its 'common interface' elements, that element is consumed and is no longer accessible individually.

An example of this would be itunes:author. Because it is mapped to the core author field, it is not accessible via feed['itunes_author']. If another element also maps to the core author field and feedparser's precedence rules rank it higher, the itunes:author information becomes impossible to access.

I think that all tags should be accessible manually and that the mapping to that common interface should be supplementary. It shouldn't throw away any information.

You could do this by making all elements individually accessible like so:

feed['rss:author']
feed['itunes:author']
feed['atom:subtitle']
feed['itunes:subtitle']

You would still be able to access elements via the common interface, feed['author'] or feed['subtitle'], according to the well-documented precedence rules. However, if I, as an application writer, want to ensure that, say, any iTunes element takes precedence over the others, I can do this myself by reading the individual elements directly and bypassing the common interface.
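To make the proposal concrete, here is a minimal sketch of the idea using only the standard library. It is not feedparser's implementation or API: the `parse_channel` function, the `PRECEDENCE` table, and the `prefix:name` key format are all assumptions made up for illustration, and the sample feed is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix table mapping namespace URIs to short prefixes.
NAMESPACES = {
    "http://www.itunes.com/dtds/podcast-1.0.dtd": "itunes",
    "http://www.w3.org/2005/Atom": "atom",
}

# Hypothetical precedence rules for the common interface: first match wins.
PRECEDENCE = {
    "author": ["rss:author", "itunes:author"],
    "subtitle": ["atom:subtitle", "itunes:subtitle"],
}


def qualified_key(tag):
    """Turn '{uri}local' into 'prefix:local'; un-namespaced tags become 'rss:local'."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return f"{NAMESPACES.get(uri, uri)}:{local}"
    return f"rss:{tag}"


def parse_channel(xml_text):
    channel = ET.fromstring(xml_text).find("channel")
    feed = {}
    for child in channel:
        # Keep every element under its own qualified key (first occurrence wins).
        feed.setdefault(qualified_key(child.tag), (child.text or "").strip())
    # The common interface is added on top; the raw keys are never removed.
    for field, candidates in PRECEDENCE.items():
        for key in candidates:
            if key in feed:
                feed[field] = feed[key]
                break
    return feed


SAMPLE = """<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <author>editor@example.com</author>
    <itunes:author>Example Podcast Network</itunes:author>
  </channel>
</rss>"""

feed = parse_channel(SAMPLE)

# Common interface, resolved by the library-side precedence table.
common_author = feed["author"]

# Application-defined precedence: prefer the iTunes value when it exists.
preferred_author = feed.get("itunes:author") or feed.get("rss:author")
```

With this layout the common interface stays a convenience layer, and an application that disagrees with the built-in precedence can read the individual keys without losing anything.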

This is similar to the approach https://github.com/danmactough/node-feedparser takes and I think it allows for a lot more flexibility.

mmcdole avatar Sep 14 '15 20:09 mmcdole

I totally agree with @mmcdole. I have been using other parsers just to get around some of these issues. It would be great if you could fix this.

Thanks again for the great work on this library.

sadhiappan avatar Sep 14 '15 20:09 sadhiappan