xmlconvert
xml_to_df ends up ignoring encoding
I am trying to open an XML file encoded in ISO-8859-1 (aka latin-1) using xmlconvert, yet even though I specify xml.encoding, it still complains that my input isn't proper UTF-8. My call and traceback are as follows:
> carbu_df <- xmlconvert::xml_to_df("./PrixCarburants_instantane.xml",
+ xml.encoding = "latin-1",
+ records.xpath = "//pdv | //prix",
+ fields = "attributes")
Error in read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html, :
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE8 0x73 0x2D 0x4C [9]
> traceback()
4: read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html,
options = options)
3: read_xml.character(text, encoding = xml.encoding)
2: xml2::read_xml(text, encoding = xml.encoding)
1: xmlconvert::xml_to_df("./PrixCarburants_instantane.xml", xml.encoding = "latin-1",
records.xpath = "//pdv | //prix", fields = "attributes")
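Loading the file using xml2::read_xml("./PrixCarburants_instantane.xml", encoding="latin-1") does work, and so does opening the file in Notepad and saving it as UTF-8 (which is a bit tedious). Judging from the traceback, xml_to_df hands the file contents to xml2::read_xml as text, and read_xml.raw is then called with a literal "UTF-8", so my xml.encoding never seems to reach the parser. It appears to me that enc2utf8 and charToRaw aren't doing their job when confronted with raw latin-1 input.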
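To illustrate what I suspect is happening, here is a minimal base-R sketch. It does not use xmlconvert or xml2 at all; the tiny "<a>è</a>" document and the variable names are made up for the demonstration. It only shows that a latin-1 byte such as 0xE8 is not valid UTF-8 until the text is explicitly converted, for example with iconv.

```r
# "<a>è</a>" encoded as ISO-8859-1: the "è" is the single byte 0xE8,
# the same leading byte that the error message above reports.
latin1_bytes <- as.raw(c(0x3C, 0x61, 0x3E, 0xE8, 0x3C, 0x2F, 0x61, 0x3E))

# rawToChar() produces a string that carries no encoding mark, so later
# code has no way of knowing the bytes are latin-1.
txt <- rawToChar(latin1_bytes)
is_valid_before <- validUTF8(txt)

# Explicitly convert from latin-1 to UTF-8: 0xE8 becomes the
# two-byte sequence 0xC3 0xA8.
fixed <- iconv(txt, from = "latin1", to = "UTF-8")
is_valid_after <- validUTF8(fixed)
utf8_bytes <- charToRaw(fixed)

print(is_valid_before)
print(is_valid_after)
print(utf8_bytes)
```

If the text inside xml_to_df were converted along these lines before being passed on (or if the file path and encoding were forwarded to xml2::read_xml directly), I would expect the error to go away, but that is an assumption on my part since I have not checked the package source.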