an angled bracket in title
Hi,
If I put the following feed into the library -
`
<rss version="2.0">
The parsed output is -
{
title: 'RSS >>',
description: 'New RSS tutorial on W3Schools',
summary: 'New RSS tutorial on W3Schools',
date: null,
pubdate: null,
pubDate: null,
link: 'http://www.w3schools.com/xml/xml_rss.asp',
guid: 'http://www.w3schools.com/xml/xml_rss.asp',
author: null,
comments: null,
origlink: null,
image: {},
source: {},
categories: [],
enclosures: [],
'rss:@': {},
'rss:title': { '@': {}, '#': 'RSS <<<Tutorial>>>' },
'rss:link': { '@': {}, '#': 'http://www.w3schools.com/xml/xml_rss.asp' },
'rss:description': { '@': {}, '#': 'New RSS tutorial on W3Schools' },
}
Note that title contains the wrong text ('RSS >>'), while rss:title has the correct content ('RSS <<<Tutorial>>>').
@danmactough is there an option I can pass when calling feedparser to drop the { '@': {}, '#': value } wrapper and just get the value?
So instead of 'rss:link': { '@': {}, '#': 'http://www.w3schools.com/xml/xml_rss.asp' }, I would get 'rss:link': 'http://www.w3schools.com/xml/xml_rss.asp'?
@theasteve 'rss:link' is a "raw" element, meaning it isn't normalized and retains all the information in the original XML. As a result, we need to retain both the attributes (the @) and the text node (the #).
But generally, the item's link property will have the value you want.
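If you still want the raw elements flattened, you can post-process the item yourself. The following is a minimal sketch, not a feedparser option: `unwrapRaw` and `flattenRawElements` are hypothetical helper names, and the sample `item` is a trimmed copy of the output shown above. It only unwraps elements that carry no attributes and no child elements, so no information from the original XML is lost.

```javascript
// Hypothetical helper (not part of feedparser): unwrap a raw element of the
// form { '@': attrs, '#': text } into its text, but only when the element has
// no attributes and no child elements.
function unwrapRaw(value) {
  if (Array.isArray(value)) {
    return value.map(unwrapRaw);
  }
  if (value === null || typeof value !== 'object') {
    return value; // strings, numbers, null: nothing to unwrap
  }
  const attrs = value['@'];
  const hasAttrs = Boolean(attrs) && Object.keys(attrs).length > 0;
  const childKeys = Object.keys(value).filter((k) => k !== '@' && k !== '#');
  if ('#' in value && !hasAttrs && childKeys.length === 0) {
    return value['#'];
  }
  return value; // keep anything that carries attributes or children
}

// Hypothetical helper: apply unwrapRaw to every top-level property of an item.
function flattenRawElements(item) {
  const out = {};
  for (const [key, value] of Object.entries(item)) {
    out[key] = unwrapRaw(value);
  }
  return out;
}

// Sample data trimmed from the parsed output above.
const item = {
  link: 'http://www.w3schools.com/xml/xml_rss.asp',
  'rss:@': {},
  'rss:link': { '@': {}, '#': 'http://www.w3schools.com/xml/xml_rss.asp' },
  'rss:description': { '@': {}, '#': 'New RSS tutorial on W3Schools' },
};

const flat = flattenRawElements(item);
console.log(flat['rss:link']); // prints "http://www.w3schools.com/xml/xml_rss.asp"
```

Elements that do have attributes are returned unchanged, which is the case the `@` key exists for.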