Description
I'm experiencing a really odd issue while trying to deserialize some basic XML. Here's my setup:
final XMLInputFactory xmlInputFactory = XMLInputFactory.newFactory();
xmlInputFactory.setProperty("javax.xml.stream.isNamespaceAware", false);
final XmlMapper xmlMapper = new XmlMapper(xmlInputFactory);
xmlMapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
final RssDocument doc = xmlMapper.readValue(content, RssDocument.class);
POJOs:
private static class RssDocument {
public RssChannel channel = null;
}
private static class RssChannel {
public String title = null;
public String description = null;
@JacksonXmlProperty(localName = "link")
public String siteUrl = null;
@JacksonXmlProperty(localName = "lastBuildDate")
public String updated = null;
@JacksonXmlElementWrapper(useWrapping = false)
@JacksonXmlProperty(localName = "item")
public List<RssItem> items = null;
}
private static class RssItem {
@JacksonXmlProperty(localName = "guid")
public String localId = null;
@JacksonXmlProperty(localName = "pubDate")
public String updated = null;
public String title = null;
@JacksonXmlProperty(localName = "link")
public String articleUrl = null;
@JacksonXmlProperty(localName = "dc:creator")
public String author = null;
@JacksonXmlProperty(localName = "content:encoded")
public String encodedContent = null;
@JacksonXmlProperty(localName = "description")
public String content = null;
}
XML I'm trying to parse:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?>
<?xml-stylesheet type="text/css" media="screen" href="http://feeds.arstechnica.com/~d/styles/itemcontent.css"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" version="2.0">
<channel>
<title>Ars Technica</title>
<link>https://arstechnica.com</link>
<description>Serving the Technologist for more than a decade. IT news, reviews, and analysis.</description>
<lastBuildDate>Sun, 23 Aug 2020 23:48:25 +0000</lastBuildDate>
<language>en-US</language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<generator>https://wordpress.org/?v=4.9.15</generator>
<image>
<url>https://cdn.arstechnica.net/wp-content/uploads/2016/10/cropped-ars-logo-512_480-32x32.png</url>
<title>Ars Technica</title>
<link>https://arstechnica.com</link>
<width>32</width>
<height>32</height>
</image>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.arstechnica.com/arstechnica/index" />
<feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="arstechnica/index" />
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" />
<item>
<title>Trump announces a COVID-19 Emergency Use Authorization for blood plasma</title>
<link>https://arstechnica.com/?p=1700815</link>
<pubDate>Sun, 23 Aug 2020 23:26:43 +0000</pubDate>
<dc:creator><![CDATA[John Timmer]]></dc:creator>
<category><![CDATA[Policy]]></category>
<category><![CDATA[Science]]></category>
<category><![CDATA[antibodies]]></category>
<category><![CDATA[COVID-19]]></category>
<category><![CDATA[Emergency Use Authorization]]></category>
<category><![CDATA[fda]]></category>
<category><![CDATA[plasma]]></category>
<category><![CDATA[SARS-CoV-2]]></category>
<category><![CDATA[Trump]]></category>
<guid isPermaLink="false">https://arstechnica.com/?p=1700815</guid>
<description><![CDATA[Move reportedly happens over objections of health officials about limited data.]]></description>
<content:encoded><![CDATA[<div id="rss-wrap">
<figure class="intro-image intro-left"><img src="https://cdn.arstechnica.net/wp-content/uploads/2020/08/GettyImages-1220853125-800x604.jpg" alt="Image of a man gesturing towards another."><p class="caption" style="font-size:0.8em"><a href="https://cdn.arstechnica.net/wp-content/uploads/2020/08/GettyImages-1220853125.jpg" class="enlarge-link" data-height="773" data-width="1024">Enlarge</a> <span class="sep">/</span> Donald Trump gestures to Stephen Hahn, head of the FDA, at an earlier press conference. (credit: <a rel="nofollow" class="caption-link" href="https://www.gettyimages.com/detail/news-photo/president-donald-trump-and-stephen%C2%A0hahn-director-of-the-news-photo/1220853125?adppopup=true">Drew Angerer</a>)</p> </figure><div><a name="page-1"></a></div>
<p>Today, President Trump held a news conference to announce that the FDA has granted an Emergency Use Authorization for the treatment of COVID-19 cases using blood plasma from those formerly infected. The move comes despite significant uncertainty regarding just how effective this treatment is, and comes just days after Trump attacked the FDA for delaying its work as part of a plot to sabotage his re-election.</p>
<h2>In the blood</h2>
<p>Plasma is the liquid portion of the blood, which (among other things) contains antibodies. It has been used to treat other infections, as some antibodies can be capable of neutralizing the infecting pathogen—binding to the bacteria or virus in a way that prevents it from entering cells. Early studies have indicated that it's relatively common for those who have had a SARS-CoV-2 infection to generate antibodies that can neutralize the virus in lab tests, although the antibody response to the virus is also <a href="https://arstechnica.com/science/2020/06/antibody-testing-suggests-immune-response-post-covid-is-very-variable/">highly variable</a>.</p>
<p>In the absence of any effective treatments, people started testing this "convalescent plasma" <a href="https://arstechnica.com/science/2020/03/hospitals-in-nyc-will-start-testing-therapy-using-plasma-of-those-infected/">as early as March</a>, and testing has been expanded as the pool of post-infected individuals has continued to grow. But so far, the evidence has been mixed. One of the largest studies, led by researchers at the Mayo Clinic and including over 35,000 patients, <a href="https://doi.org/10.1101/2020.08.12.20169359">did see an effect</a>, but it was a very mild one: mortality dropped from 11.9 percent in people who received plasma four days or more after starting treatment, compared with 8.7 percent if treatment was started earlier than that. But, critically, the study lacked a control group, leaving its authors talking about "signatures of efficacy," rather than actual evidence of efficacy.</p></div><p><a href="https://arstechnica.com/?p=1700815#p3">Read 7 remaining paragraphs</a> | <a href="https://arstechnica.com/?p=1700815&comments=1">Comments</a></p><div class="feedflare">
<a href="http://feeds.arstechnica.com/~ff/arstechnica/index?a=P614tPda1Eo:cpKHHEXnQ5w:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/arstechnica/index?i=P614tPda1Eo:cpKHHEXnQ5w:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.arstechnica.com/~ff/arstechnica/index?a=P614tPda1Eo:cpKHHEXnQ5w:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/arstechnica/index?i=P614tPda1Eo:cpKHHEXnQ5w:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.arstechnica.com/~ff/arstechnica/index?a=P614tPda1Eo:cpKHHEXnQ5w:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/arstechnica/index?d=qj6IDK7rITs" border="0"></img></a> <a href="http://feeds.arstechnica.com/~ff/arstechnica/index?a=P614tPda1Eo:cpKHHEXnQ5w:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/arstechnica/index?d=yIl2AUoC8zA" border="0"></img></a>
</div>]]></content:encoded>
</item>
</channel>
</rss>
Attempting to run this as-is yields the following error:
Cannot construct instance of `readerless.service.feed.Rss2Parser$RssItem` (although at least one Creator exists): no String-argument constructor/factory method to deserialize from String value ('Move reportedly happens over objections of health officials about limited data.')
at [Source: (StringReader); line: 39, column: 111] (through reference chain: readerless.service.feed.Rss2Parser$RssDocument["channel"]->readerless.service.feed.Rss2Parser$RssChannel["item"]->java.util.ArrayList[1])
com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot construct instance of `readerless.service.feed.Rss2Parser$RssItem` (although at least one Creator exists): no String-argument constructor/factory method to deserialize from String value ('Move reportedly happens over objections of health officials about limited data.')
at [Source: (StringReader); line: 39, column: 111] (through reference chain: readerless.service.feed.Rss2Parser$RssDocument["channel"]->readerless.service.feed.Rss2Parser$RssChannel["item"]->java.util.ArrayList[1])
If I remove <description/>
, the same error will be thrown for <content:encoded/>
. If I remove both elements, the errors go away - even though there are multiple other CDATA
elements further up. If I move the two elements so that they occur before <guid/>
, the error goes away and both elements are parsed without any issues.
If, to resolve the error, I add a constructor to RssItem
that accepts a String
, that constructor will be called for both <description/>
and <content:encoded/>
, which seems highly unusual - it's almost as if those two are being parsed as separate <item/>
nodes themselves.
I'm using version 2.11.2, though I've tried multiple other versions. I've also tried using woodstox-core
, version 5.2.1.0, but the error persists. I've also tried marking up both content
and encodedContent
@JacksonXmlCData
and @JacksonXmlText
, to no avail.