I have seen this question asked before, but I did not find an answer.
So, I am getting this error:
The ':' character, hexadecimal value 0x3A, cannot be included in a name.
for this code:
XDocument XMLFeed = XDocument.Load("http://feeds.foxnews.com/foxnews/most-popular?format=xml");
XNamespace content = "http://purl.org/rss/1.0/modules/content/";
var feeds = from feed in XMLFeed.Descendants("item")
select new
{
Title = feed.Element("title").Value,
Link = feed.Element("link").Value,
pubDate = feed.Element("pubDate").Value,
Description = feed.Element("description").Value,
MediaContent = feed.Element(content + "encoded")
};
foreach (var f in feeds.Reverse())
{
....
}
An item looks like this:
<rss>
<channel>
....items....
<item>
<title>Pentagon confirms plan to create new spy agency</title>
<link>http://feeds.foxnews.com/~r/foxnews/most-popular/~3/lVUZwCdjVsc/</link>
<category>politics</category>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/" />
<pubDate>Tue, 24 Apr 2012 12:44:51 PDT</pubDate>
<guid isPermaLink="false">http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</guid>
<content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[|http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg<img src="http://feeds.feedburner.com/~r/foxnews/most-popular/~4/lVUZwCdjVsc" height="1" width="1"/>]]></content:encoded>
<description>The Pentagon confirmed Tuesday that it is carving out a brand new spy agency expected to include several hundred officers focused on intelligence gathering around the world.&amp;#160;</description>
<dc:date xmlns:dc="http://purl.org/dc/elements/1.1/">2012-04-4T19:44:51Z</dc:date>
<feedburner:origLink>http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</feedburner:origLink>
</item>
....items....
</channel>
</rss>
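All I want is to get "http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg", after first checking whether content:encoded exists.
Thanks.
EDIT: I have found a sample that I can show, and edited the code that tries to handle it.
EDIT2: I have done it the ugly way: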
text.Replace("content:encoded", "contentt").Replace("xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"","");
and then got the element the normal way:
MediaContent = feed.Element("contentt").Value
Answer 0 (score: 0)
You should use XNamespace:
XNamespace content = "...";
// later in your code ...
MediaContent = feed.Element(content + "encoded")
See here for more details.
(Of course, the string you assign to content must be the same as the one in xmlns:content="...".)
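The question also asks how to check whether content:encoded exists. The following is a minimal sketch of one way to do that, not the asker's actual code: the class name ContentEncodedDemo, the helper GetEncoded, and the two-item feed are made up for illustration. It relies on two facts about LINQ to XML: a plain string such as "content:encoded" cannot be converted to an XName (that conversion is what raises the error quoted in the question), and Element returns null when no child matches.

```csharp
using System;
using System.Xml.Linq;

public static class ContentEncodedDemo
{
    // Namespace URI copied from the xmlns:content attribute in the feed.
    private static readonly XNamespace ContentNs = "http://purl.org/rss/1.0/modules/content/";

    // Hypothetical helper: returns the text of <content:encoded>,
    // or null when the item has no such element.
    public static string GetEncoded(XElement item)
    {
        // XNamespace + string yields an XName; Element returns null if nothing matches.
        XElement encoded = item.Element(ContentNs + "encoded");
        return encoded == null ? null : encoded.Value;
    }

    public static void Main()
    {
        // Made-up feed: one item with content:encoded, one without.
        XDocument doc = XDocument.Parse(
            @"<rss><channel>
                <item>
                  <title>With image</title>
                  <content:encoded xmlns:content='http://purl.org/rss/1.0/modules/content/'><![CDATA[|http://example.com/a.jpg<img src='x'/>]]></content:encoded>
                </item>
                <item><title>Without image</title></item>
              </channel></rss>");

        foreach (XElement item in doc.Descendants("item"))
        {
            string encoded = GetEncoded(item);
            Console.WriteLine((string)item.Element("title") + ": " + (encoded ?? "(no content:encoded)"));
        }
    }
}
```

Casting the element with (string) instead of reading .Value gives the same null-when-missing behaviour in a single expression.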
Answer 1 (score: 0)
The following code:
static void Main(string[] args)
{
var XMLFeed = XDocument.Parse(
@"<rss>
<channel>
....items....
<item>
<title>Pentagon confirms plan to create new spy agency</title>
<link>http://feeds.foxnews.com/~r/foxnews/most-popular/~3/lVUZwCdjVsc/</link>
<category>politics</category>
<dc:creator xmlns:dc='http://purl.org/dc/elements/1.1/' />
<pubDate>Tue, 24 Apr 2012 12:44:51 PDT</pubDate>
<guid isPermaLink='false'>http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</guid>
<content:encoded xmlns:content='http://purl.org/rss/1.0/modules/content/'><![CDATA[|http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg<img src='http://feeds.feedburner.com/~r/foxnews/most-popular/~4/lVUZwCdjVsc' height='1' width='1'/>]]></content:encoded>
<description>The Pentagon confirmed Tuesday that it is carving out a brand new spy agency expected to include several hundred officers focused on intelligence gathering around the world.&amp;#160;</description>
<dc:date xmlns:dc='http://purl.org/dc/elements/1.1/'>2012-04-4T19:44:51Z</dc:date>
<!-- <feedburner:origLink>http://www.foxnews.com/politics/2012/04/24/pentagon-confirms-plan-to-create-new-spy-agency/</feedburner:origLink> -->
</item>
....items....
</channel>
</rss>");
XNamespace contentNs = "http://purl.org/rss/1.0/modules/content/";
var feeds = from feed in XMLFeed.Descendants("item")
select new
{
Title = (string)feed.Element("title"),
Link = (string)feed.Element("link"),
pubDate = (string)feed.Element("pubDate"),
Description = (string)feed.Element("description"),
MediaContent = GetMediaContent((string)feed.Element(contentNs + "encoded"))
};
foreach(var item in feeds)
{
Console.WriteLine(item);
}
}
private static string GetMediaContent(string content)
{
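// A missing <content:encoded> element casts to null, so guard before searching.
if (string.IsNullOrEmpty(content)) return string.Empty;
int imgStartPos = content.IndexOf("<img");
if(imgStartPos > 0)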
{
int startPos = content[0] == '|' ? 1 : 0;
return content.Substring(startPos, imgStartPos - startPos);
}
return string.Empty;
}
Result:
{ Title = Pentagon confirms plan to create new spy agency, Link = http://feeds.foxnews.com/~r/foxnews/most-popular/~3/lVUZwCdjVsc/, pubDate = Tue, 24 Apr 2012 12:44:51 PDT, Description = The Pentagon confirmed Tuesday that it is carving out a brand new spy agency expected to include several hundred officers focused on intelligence gathering around the world. , MediaContent = http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg }
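A few points:
This is no longer relevant to the question, but it may help someone, so I am leaving it in.
Consider the content of the encoded element: it is inside a CDATA section. What is inside a CDATA section is not Xml but plain text. CDATA is typically used so that the '<', '>' and '&' characters do not have to be escaped (without CDATA they would have to be written as &lt;, &gt; and &amp; so as not to break the Xml document itself); the Xml processor treats the characters inside CDATA as if they had been escaped. CDATA is convenient if you want to embed html, because the embedded content looks like the original in the text, yet it will not break your xml if the html is not well-formed Xml.

Since the CDATA content is not Xml but text, it cannot be treated as Xml. You will probably need to treat it as text and use, for example, regular expressions. If you know it is valid Xml, you can load the content into an XElement again and process it. In your case you have mixed content (text followed by an element), so this is not easy to do unless you use a dirty hack. If you had just one top-level element instead of mixed content, everything would be simple. The hack is to add a wrapper element to avoid all the hassle. Inside the foreach you could do something like this (assuming item.MediaContent holds the raw text of the encoded element rather than the trimmed URL):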
var mediaContentXml = XElement.Parse("<content>" + (string)item.MediaContent + "</content>");
Console.WriteLine((string)mediaContentXml.Element("img").Attribute("src"));
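Again, it is not pretty and it is a hack, but it will work if the content of the encoded element is valid Xml. A more correct way would be to use XmlReader with ConformanceLevel set to Fragment and recognize all kinds of nodes appropriately to create the corresponding Linq to Xml nodes.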
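The XmlReader approach can be sketched roughly as follows. This is an illustration under assumptions, not a tested production implementation: the class name FragmentDemo and the helper ParseFragment are hypothetical, and the sketch leaves node recognition to XNode.ReadFrom rather than switching on node types by hand. It also assumes the CDATA text is well-formed as an Xml fragment (for example, no unescaped '&' in the URLs).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class FragmentDemo
{
    // Hypothetical helper: parses text that may hold several top-level nodes
    // (text mixed with elements) into LINQ to XML nodes, without a wrapper element.
    public static List<XNode> ParseFragment(string fragment)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var nodes = new List<XNode>();

        using (XmlReader reader = XmlReader.Create(new StringReader(fragment), settings))
        {
            // Position the reader on the first content node.
            reader.MoveToContent();
            // XNode.ReadFrom consumes the current node and advances to the next one.
            while (reader.ReadState != ReadState.EndOfFile)
            {
                nodes.Add(XNode.ReadFrom(reader));
            }
        }
        return nodes;
    }

    public static void Main()
    {
        // Raw text of the CDATA section from the sample item in the question.
        string raw = "|http://global.fncstatic.com/static/managed/img/Politics/panetta_hearing_030712.jpg" +
                     "<img src=\"http://feeds.feedburner.com/~r/foxnews/most-popular/~4/lVUZwCdjVsc\" height=\"1\" width=\"1\"/>";

        foreach (XNode node in ParseFragment(raw))
        {
            var text = node as XText;
            var element = node as XElement;
            if (text != null)
                Console.WriteLine("text: " + text.Value.TrimStart('|'));
            else if (element != null && element.Name == "img")
                Console.WriteLine("img src: " + (string)element.Attribute("src"));
        }
    }
}
```

Compared with the wrapper-element hack, this keeps the leading text and the img element as separate nodes, so the image URL can be read from the text node directly.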