node-xml2js
Question about parsing XML attribute names
I have been using Node.js exec() with jq/yq/xq to convert between XML and JSON. Comparing the XML-to-JSON output of xml2js with that of xq, the returned JSON is structured differently. With xq:
const { error, stderr, stdout } = await exec(`cat ${file}|xq`, {maxBuffer: 1024 * 1024 * 1024});
the value of stdout, once parsed, appears as:
{
  mediawiki: {
    '@xmlns': 'http://www.mediawiki.org/xml/export-0.10/',
    '@xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
    '@xsi:schemaLocation': 'http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd',
    '@version': '0.10',
    '@xml:lang': 'en',
    siteinfo: {
      sitename: 'Wikipedia',
      dbname: 'enwiki',
      base: 'https://en.wikipedia.org/wiki/Main_Page',
      generator: 'MediaWiki 1.38.0-wmf.24',
      case: 'first-letter',
      namespaces: [Object]
    },
    page: [
      [Object], [Object], [Object], [Object], [Object], [Object]
    ]
  }
}
But when using:
xml2js.parseString(await fs.promises.readFile(file, 'utf8'), function (err, result) {
  if (err) throw err; // parse errors arrive here; read errors reject the awaited promise
  console.dir(result);
});
the output appears as:
{
  mediawiki: {
    '$': {
      xmlns: 'http://www.mediawiki.org/xml/export-0.10/',
      'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
      'xsi:schemaLocation': 'http://www.mediawiki.org/xml/export-0.10/ http://www.mediawiki.org/xml/export-0.10.xsd',
      version: '0.10',
      'xml:lang': 'en'
    },
    siteinfo: [ [Object] ],
    page: [
      [Object], [Object], [Object], [Object], [Object], [Object],
    ]
  }
}
Is there a reason for the difference, or a way to configure the output to match?
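For what it's worth, the difference looks like a convention rather than a bug. If the xq in question is the one shipped with the Python yq package, it builds on xmltodict, which prefixes attribute names with '@' and stores text next to attributes under '#text'. xml2js by default groups attributes under '$', puts such text under '_', and wraps child elements in arrays. Below is a minimal sketch of reshaping an already-parsed xml2js result into the xq style. toXqShape is a hypothetical helper, not part of either tool, and it assumes default xml2js options; the sample object is made up.

```javascript
// Hypothetical helper: reshape a default-options xml2js result into the
// '@attr' / '#text' layout that xq prints.
function toXqShape(node) {
  if (Array.isArray(node)) {
    const items = node.map((item) => toXqShape(item));
    // xml2js wraps every child in an array; xq only uses arrays for repeats.
    return items.length === 1 ? items[0] : items;
  }
  if (node === null || typeof node !== 'object') {
    return node; // plain text content
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === '$') {
      // Lift attributes to the element level with an '@' prefix.
      for (const [attr, attrValue] of Object.entries(value)) {
        out['@' + attr] = attrValue;
      }
    } else if (key === '_') {
      out['#text'] = value; // text that sits beside attributes or children
    } else {
      out[key] = toXqShape(value);
    }
  }
  return out;
}

// Made-up sample in the shape xml2js returns by default.
const sample = {
  mediawiki: {
    $: { version: '0.10', 'xml:lang': 'en' },
    siteinfo: [{ sitename: ['Wikipedia'] }],
    page: [{ title: ['A'] }, { title: ['B'] }],
  },
};
console.dir(toXqShape(sample), { depth: null });
```

Empty elements and other edge cases may still come out differently between the two tools, so treat this as a starting point rather than an exact match.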
Edited to add: I also tried an attribute name processor:
let parser = new xml2js.Parser({
attrNameProcessors: [ function (name) { console.log(name); return name } ]
});
With that processor in place, the attribute names still do not have the '@' symbol at the beginning. I couldn't find any option that preserves the data exactly as it appears in the source XML, without modification. Is there something I missed?
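One thing that may explain this: the '@' is not part of the attribute names in the XML itself, so a processor that returns the name unchanged has nothing to retain. The prefix would have to be added by the processor. Here is a sketch, assuming an xml2js version that provides parseStringPromise, and using the mergeAttrs, explicitArray, charkey and attrNameProcessors options from the xml2js README. The names xqLikeOptions and parseLikeXq are made up for illustration.

```javascript
// Options intended to approximate xq's layout (an assumption, not verified
// against every document shape).
const xqLikeOptions = {
  mergeAttrs: true,      // put attributes beside child elements, not under '$'
  explicitArray: false,  // only use arrays for repeated elements
  charkey: '#text',      // key for text that sits next to attributes or children
  attrNameProcessors: [(name) => '@' + name], // add the prefix xq shows
};

async function parseLikeXq(xml) {
  // Required lazily so the options above can be inspected without xml2js.
  const xml2js = require('xml2js');
  const parser = new xml2js.Parser(xqLikeOptions);
  return parser.parseStringPromise(xml);
}
```

Usage would be along the lines of console.dir(await parseLikeXq(await fs.promises.readFile(file, 'utf8')), { depth: null }), which also expands the nested [Object] entries for comparison.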