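While trying to parse a blog's RSS feed, I ran into a problem. Every element maps into my class just fine, but the element that holds the actual content always comes back empty.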
<content:encoded>THIS IS FULL OF HTML
</content:encoded>
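This seems to be the one XML line that does not get parsed. It is also the only element with a colon in its name, and the only one that contains HTML data. The others look like this.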
<title>
An amazing Title
</title>
<link>
More Junk
</link>
<comments>
Comments and things
</comments>
My code is below; it picks up every other element fine. Any ideas?
allPosts = (from x in feed.Descendants("item")
select new blogPost
{
Creator = (string)x.Element("creator"),
Title = (string)x.Element("title"),
Published = DateTime.Parse((string)x.Element("pubDate")),
Content = (string)x.Element("content"),
Description = (string)x.Element("description"),
Link = (string)x.Element("link"),
}).ToList<blogPost>();
Thanks
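Answer 0 (score: 1)
It looks like you are querying for content when the element's name is actually encoded. In content:encoded, content is the XML namespace prefix and encoded is the local name. You need to define a suitable XNamespace for that prefix and add it to your query: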
XNamespace contentNS = "<whatever the namespace is>";
allPosts = (from x in feed.Descendants("item")
select new blogPost
{
Creator = (string)x.Element("creator"),
Title = (string)x.Element("title"),
Published = DateTime.Parse((string)x.Element("pubDate")),
// Looking for content:encoded
Content = (string)x.Element(contentNS + "encoded"),
Description = (string)x.Element("description"),
Link = (string)x.Element("link"),
}).ToList<blogPost>();
The value of contentNS depends on what the original XML declares; look for the xmlns:content definition on the root element.
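If you would rather not hard-code the URI, one option is to let LINQ to XML resolve the prefix from the document itself with XElement.GetNamespaceOfPrefix. The following is only a minimal sketch under stated assumptions: the FeedContentSketch class, the ReadEncoded method, and the SampleXml document are hypothetical names made up for illustration, and the xmlns:content URI in the sample is simply what many RSS feeds declare, so check what your feed actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FeedContentSketch
{
    // Hypothetical sample feed. The xmlns:content URI below is an assumption
    // for this demo; your feed's root element is the authority.
    public const string SampleXml =
@"<rss version=""2.0"" xmlns:content=""http://purl.org/rss/1.0/modules/content/"">
  <channel>
    <item>
      <title>An amazing Title</title>
      <content:encoded><![CDATA[<p>THIS IS FULL OF HTML</p>]]></content:encoded>
    </item>
  </channel>
</rss>";

    // Returns the text of content:encoded for every <item> in the feed.
    public static List<string> ReadEncoded(string xml)
    {
        XDocument feed = XDocument.Parse(xml);

        // Resolve the "content" prefix to whatever namespace the root declares.
        XNamespace contentNS = feed.Root.GetNamespaceOfPrefix("content");
        if (contentNS == null)
        {
            throw new InvalidOperationException(
                "The root element does not declare an xmlns:content namespace.");
        }

        // Namespace + local name forms the XName that matches content:encoded.
        return feed.Descendants("item")
                   .Select(x => (string)x.Element(contentNS + "encoded"))
                   .ToList();
    }
}
```

Calling FeedContentSketch.ReadEncoded(FeedContentSketch.SampleXml) should give you a one-item list holding the HTML string from the sample. In your own query you could use the resolved contentNS in place of the hard-coded one.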