Question

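While trying to parse a blog's RSS feed, I ran into a problem. Every element maps into my class just fine, but the element that holds the actual content always comes back empty.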

<content:encoded>THIS IS FULL OF HTML </content:encoded>

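This seems to be the only line of the XML that does not parse. It is also the only element with a colon in its name, and the only one that contains HTML data. The others look like this: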

<title>
An amazing Title
</title>
<link>
More Junk
</link>
<comments>
Comments and things
</comments>

My code is below; it populates every other element fine. Any ideas?

allPosts = (from x in feed.Descendants("item")
                         select new blogPost
                         {
                             Creator = (string)x.Element("creator"),
                             Title = (string)x.Element("title"),
                             Published = DateTime.Parse((string)x.Element("pubDate")),
                             Content = (string)x.Element("content"),
                             Description = (string)x.Element("description"),
                             Link = (string)x.Element("link"),
                         }).ToList<blogPost>();

Thanks

Answer 1

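It looks like you are querying for content rather than encoded. The element's local name is encoded; content is the XML namespace prefix attached to it. You need to define the matching XNamespace and use it in your query: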

XNamespace contentNS = "<whatever the namespace is>";

allPosts = (from x in feed.Descendants("item")
                         select new blogPost
                         {
                             Creator = (string)x.Element("creator"),
                             Title = (string)x.Element("title"),
                             Published = DateTime.Parse((string)x.Element("pubDate")),

                             // Looking for content:encoded
                             Content = (string)x.Element(contentNS + "encoded"),

                             Description = (string)x.Element("description"),
                             Link = (string)x.Element("link"),
                         }).ToList<blogPost>();

The value of contentNS depends on what your original XML declares; look for the xmlns:content declaration on the root element.
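If you would rather not hard-code the URI, the prefix can probably be resolved from the document itself. The following is only a minimal sketch, assuming the feed declares the content prefix on its root element. The sample XML and its namespace URI (the one commonly used by the RSS content module) are made up for illustration and are not taken from the original feed.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample feed; the real feed's xmlns:content value may differ.
string xml = @"<rss version=""2.0"" xmlns:content=""http://purl.org/rss/1.0/modules/content/"">
  <channel>
    <item>
      <title>An amazing Title</title>
      <content:encoded><![CDATA[<p>THIS IS FULL OF HTML</p>]]></content:encoded>
    </item>
  </channel>
</rss>";

XDocument feed = XDocument.Parse(xml);

// Resolve the "content" prefix from the declarations in scope on the root element.
// GetNamespaceOfPrefix returns null when the prefix is not declared there.
XNamespace contentNS = feed.Root?.GetNamespaceOfPrefix("content")
    ?? throw new InvalidOperationException("No xmlns:content declaration found on the root element.");

// Same lookup as in the query above: namespace + local name "encoded".
var contents = feed.Descendants("item")
    .Select(x => (string)x.Element(contentNS + "encoded"))
    .ToList();

Console.WriteLine(contents[0]); // prints <p>THIS IS FULL OF HTML</p>
```

The same idea applies to any other prefixed element in the feed: resolve its prefix to an XNamespace first, then combine it with the local name.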

Parsing XML data with XDocument / XElement
