Skip to content

Incorrectly parsing XML with duplicated tag names #11

@nurkiewicz

Description

@nurkiewicz

Trying to parse the following XML document:

<data>
  <r><a>A</a></r>
  <r><b>B</b><c>C</c></r>
</data>

with:

new XmlMapper().readValue(xml, Map.class)

ignores the first "r" (r -> {a -> A}) node, overriding it with a second one (r -> {b -> B, c -> C}). It should generate a map with a single key and array value instead: r -> [{a -> A}, {b -> B, c -> C}]. The problem is here (last line of org.codehaus.jackson.map.deser.MapDeserializer#_readAndBind):

            /* !!! 23-Dec-2008, tatu: should there be an option to verify
             *   that there are no duplicate field names? (and/or what
             *   to do, keep-first or keep-last)
             */
            result.put(key, value);

Although this can be worked around by using special map implementation instead of Map.class, but if the duplicated tags appear deeper in XML document (not at top level), there is no easy workaround, see org.codehaus.jackson.map.deser.UntypedObjectDeserializer#mapObject class (LinkedHashMap creation).

Of course the root cause of this problem is the assumption that there are no duplicate properties in JSON. In XML such nodes should be treated as arrays.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions