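I'm trying to develop a REST API in Scala that fetches the XML of several RSS feeds and then displays them as JSON. So far I can display them as text, which is fine, but I can't get the author to show up. I'm building a list of articles (where Article is a case class) and searching the XML to supply the values for the Article class.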
<item>
<title>Chinese TV Star Apologizes For Remarks Critical Of Mao</title>
<description>Bi Fujian, one of the country's most popular television presenters, recently ran afoul of his employer, state-run CCTV, for a parody song he performed at a private banquet.</description>
<pubDate>Thu, 09 Apr 2015 12:51:15 -0400</pubDate>
<link>http://www.npr.org/blogs/thetwo-way/2015/04/09/398534903/chinese-tv-star-apologizes-for-remarks-critical-of-mao?utm_medium=RSS&utm_campaign=news</link>
<guid>http://www.npr.org/blogs/thetwo-way/2015/04/09/398534903/chinese-tv-star-apologizes-for-remarks-critical-of-mao?utm_medium=RSS&utm_campaign=news</guid>
<content:encoded><![CDATA[<p>Bi Fujian, one of the country's most popular television presenters, recently ran afoul of his employer, state-run CCTV, for a parody song he performed at a private banquet.</p><p><a href="http://www.npr.org/templates/email/emailAFriend.php?storyId=398534903">» E-Mail This</a></p>]]></content:encoded>
<dc:creator>Scott Neuman</dc:creator>
</item>
That is an example of the XML I'm parsing. This is the code I'm using to parse it:
def xml = XML.loadString(retrieveArticles("http://www.npr.org/rss/rss.php?id=1007")) ++ XML.loadString(retrieveArticles("http://www.npr.org/rss/rss.php?id=1003")) ++ XML.loadString(retrieveArticles("http://www.npr.org/rss/rss.php?id=1001"))
val articles = (xml \\ "item").foldLeft(List[Article]())((ls,item) => Article((item \ "title").text,
(item \ "dc:creator").text,
(item \ "pubDate").text,
(item \ "link").text,
(item \ "description").text) :: ls)
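All the other values are processed correctly. The author is the only value that doesn't show up. When I call the API to display the articles, this is what I get: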
Title: Chinese TV Star Apologizes For Remarks Critical Of Mao,
Author: ,
Date Published: Thu, 09 Apr 2015 12:51:00 -0400,
Link: http://www.npr.org/blogs/thetwo-way/2015/04/09/398534903/chinese-tv-star-apologizes-for-remarks-critical-of-mao?utm_medium=RSS&utm_campaign=news,
Contents: Bi Fujian, one of the country's most popular television presenters, recently ran afoul of his employer, state-run CCTV, for a parody song he performed at a private banquet.
Why is the author not displayed when all the other values are?
Answer 0 (score: 2)
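The colon (:) in XML is a special character that separates an element's label from its (optional) prefix. So the label of the element you are looking for is actually creator, not dc:creator. For background, read up on prefixes (namespaces) in XML.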
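If you need to select an element by both its prefix and its label, you can use the node's prefix attribute. Here is a simplified version of the problem you are running into: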
val xml = <root><foo:bar/><qux:bar/></root>
xml \\ "foo:bar" // No elements found! This is the wrong selector.
xml \\ "bar" // NodeSeq(<foo:bar/>, <qux:bar/>)
(xml \\ "bar").filter(_.prefix == "foo") //NodeSeq(<foo:bar/>)
So, in your example, you just want to use (item \ "creator") for the author, or filter down to the dc prefix if necessary.
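To tie this back to the feed in the question, here is a minimal sketch of the lookup on a parsed item. The trimmed-down feed string, the xmlns:dc declaration on the root element, and the object and value names are assumptions made for illustration; the real NPR feed may differ.

```scala
import scala.xml.XML

object CreatorLookup {
  // Hypothetical, trimmed-down feed. The xmlns:dc declaration is assumed
  // so that the dc prefix is declared when the string is parsed.
  val feed = XML.loadString(
    """<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
      |  <channel>
      |    <item>
      |      <title>Chinese TV Star Apologizes For Remarks Critical Of Mao</title>
      |      <dc:creator>Scott Neuman</dc:creator>
      |    </item>
      |  </channel>
      |</rss>""".stripMargin)

  val item = (feed \\ "item").head

  // Selecting on the qualified name matches nothing, because the
  // element's label is just "creator".
  val byQualifiedName: String = (item \ "dc:creator").text

  // Selecting on the label alone finds the element.
  val byLabel: String = (item \ "creator").text

  // Narrowing to the dc prefix guards against other elements that
  // happen to share the "creator" label under a different prefix.
  val byPrefix: String = (item \ "creator").filter(_.prefix == "dc").text
}
```

The prefix filter only matters when a feed can contain several elements with the same label under different prefixes; for a single dc:creator per item, the plain label lookup is enough.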
As a side note, you can use map instead of foldLeft in your code, which is tidier and more idiomatic:
(xml \\ "item").map { item => Article(
(item \ "title").text,
(item \ "creator").text,
(item \ "pubDate").text,
(item \ "link").text,
(item \ "description").text
)}
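For completeness, here is a self-contained sketch of the map version. The field names of Article, the ArticleParser object, and the sample string are assumptions; the question only says that Article is a case class with five string values.

```scala
import scala.xml.XML

// Assumed shape of the case class from the question.
final case class Article(
  title: String,
  author: String,
  published: String,
  link: String,
  contents: String
)

object ArticleParser {
  // Parses an RSS document given as a string into a list of articles.
  def parse(rss: String): List[Article] =
    (XML.loadString(rss) \\ "item").map { item =>
      Article(
        (item \ "title").text,
        (item \ "creator").text, // label only, without the dc prefix
        (item \ "pubDate").text,
        (item \ "link").text,
        (item \ "description").text
      )
    }.toList

  // Hypothetical two-item feed used only to exercise parse.
  val sample: String =
    """<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
      |  <channel>
      |    <item>
      |      <title>First</title>
      |      <dc:creator>Scott Neuman</dc:creator>
      |      <pubDate>Thu, 09 Apr 2015 12:51:15 -0400</pubDate>
      |      <link>http://www.npr.org/first</link>
      |      <description>First story.</description>
      |    </item>
      |    <item>
      |      <title>Second</title>
      |      <dc:creator>Someone Else</dc:creator>
      |      <pubDate>Thu, 09 Apr 2015 13:00:00 -0400</pubDate>
      |      <link>http://www.npr.org/second</link>
      |      <description>Second story.</description>
      |    </item>
      |  </channel>
      |</rss>""".stripMargin

  val articles: List[Article] = parse(sample)
}
```

One behavioral difference to be aware of: prepending with :: inside foldLeft yields the articles in reverse document order, while map keeps them in the order they appear in the feed.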