quick-xml icon indicating copy to clipboard operation
quick-xml copied to clipboard

Deserializing tag with attribute values into Map

Open alex-semov opened this issue 3 years ago • 3 comments

Assume we have below xml structure:

<container>
  <attr>
    <name>Name</name>
    <desc>Description</desc>
  </attr>
</container>

when <attr> tag doesn't have attribute values, quick-xml parses successfully into Map (HashMap, MultiMap) Container ( "attr" : { Name(Name), Desc(Description) } )

#[derive(Deserialize, PartialEq)]
pub enum AttrDefChoice {
    Name(String),
    Desc(String),
}

#[derive(Deserialize, PartialEq)]
pub enum ContainerDef {
    Container(MultiMap<String, AttrDefChoice>),
}

but when <attr> have name value I need some wrapper struct with rename = "$value". In that case it works only with attr: Vec<AttrDefChoice>.

<container>
  <attr name="AttrName">
    <name>Name</name>
    <desc>Description</desc>
  </attr>
</container>
...

#[derive(Deserialize, PartialEq)]
pub struct AttrDef {
    name: String,

    #[serde(rename = "$value")]
    attr: MultiMap<String, AttrDefChoice>,
}

invalid type: string "Name", expected internally tagged enum AttrDefChoice

Is there any way to achieve parsing of tags with values into Map?

alex-semov avatar Apr 18 '22 15:04 alex-semov

Generally, this is the same problem as #241 -- you have a struct with a field, that could get all internal markup of an XML node

<>
  <name>Name</name>
  <desc>Description</desc>
</>

Conceptually, there is no such a defined mapping from XML to Rust in quick-xml for that. You may think, that $value is supposed for that, and I agree with that, but $value semantic also is not defined precisily. To define such semantics, I opened #369.

It is certain that we require two kinds of $value:

  • $value which should mean map any (unconsumed?) elements of the field, so that
    <>
    text
    </>
    
    <>
    <![CDATA[content]]>
    </>
    
    <>
      <tag attribute="value">content</tag>
      <tag2/>
    </>
    
    all mapped to a dedicated field
  • #text which should mean to map only text/CDATA content and raise an error if markup is encountered.

The current $value does something middle of that two.

Mingun avatar May 07 '22 17:05 Mingun

I think I have a related issue, but want to confirm. Say I have the following structure:

<upper>
   <inner name = "a">1</inner>
   <inner name = "b">2</inner>
   <inner name = "c">3</inner>
</upper>

Ideally, I'd like to deserialize into the following:

{upper: {inner: {"a" : 1, "b" : 2, "c" : 3}}}

Do I have to write a custom deserializer impl for this?

snystrom avatar Jun 10 '22 00:06 snystrom

Yes, probably this is somehow related.

Your JSON example can be modelled by the following Rust code:

struct Upper {
  inner: Inner,
}
struct Inner {
  a: usize,
  b: usize,
  c: usize,
}

In XML this struct will be represented as

<any-tag>
  <inner a="1" b="2" c="3"/>
</any-tag>

or

<any-tag>
  <inner>
    <a>1</a>
    <b>1</b>
    <c>1</c>
  </inner>
</any-tag>

I think, that your original XML should be modelled as

struct Upper {
  inner: Vec<Inner>,
}
#[serde(tag = "name", content = "$value")]
enum Inner {
  a(usize),
  b(usize),
  c(usize),
}

(where $value matters, which I described above), but I have feeling that this not work currently (even if you ignore the consequences of the fact that due to serde-rs/serde#1183 it will not work with usize).

So if you want to get the Rust representation closer to your JSON representation, you in any case need the custom (de)serializer.

Mingun avatar Jun 10 '22 04:06 Mingun