XPath for descendants

Date: 2018-11-28 08:40:32

Tags: c# xml

I have a response XML in which I am trying to find the id of each entry element, but every combination I try returns null.

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<status feed="http://feeds.bbci.co.uk/news/rss.xml?edition=int" xmlns="http://superfeedr.com/xmpp-pubsub-ext">
<http code="200">Fetched (ring) 200 242 and parsed 2/42 entries</http>
<next_fetch>1970-01-18T20:24:54.289Z</next_fetch>
<entries_count_since_last_maintenance>35</entries_count_since_last_maintenance>
<velocity>65.3</velocity>
<popularity>3.713318235496007</popularity>
<generated_ids>true</generated_ids>
<title>BBC News - Home</title>
<period>242</period>
<last_fetch>1970-01-18T20:24:54.045Z</last_fetch>
<last_parse>1970-01-18T20:24:54.045Z</last_parse>
<last_maintenance_at>1970-01-18T20:24:07.350Z</last_maintenance_at>
</status>
<link title="BBC News - Home" rel="alternate" href="https://www.bbc.co.uk/news/" type="text/html"/>
<link title="BBC News - Home" rel="image" href="https://news.bbcimg.co.uk/nol/shared/img/bbc_news_120x60.gif" type="image/gif"/>
<title>BBC News - Home</title>
<updated>2018-11-15T14:59:15.000Z</updated>
<id>bbc-news-home-2018-11-15-14</id>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://www.georss.org/georss" xmlns:as="http://activitystrea.ms/spec/1.0/" xmlns:sf="http://superfeedr.com/xmpp-pubsub-ext" xml:lang="en">
<id>https://www.bbc.co.uk/news/world-us-canada-46225486</id>
<published>2018-11-15T14:44:37.000Z</published>
<updated>2018-11-15T14:44:37.000Z</updated>
<title>Trump attacks Mueller's Russia inquiry as 'absolutely nuts'</title>
<summary type="text">The US president says the Russia inquiry is a "total mess" and calls investigators "a disgrace".</summary>
<link title="Trump attacks Mueller's Russia inquiry as 'absolutely nuts'" rel="alternate" href="https://www.bbc.co.uk/news/world-us-canada-46225486" type="text/html" xml:lang="en"/>
<link title="Trump attacks Mueller's Russia inquiry as 'absolutely nuts'" rel="thumbnail" href="http://c.files.bbci.co.uk/E64B/production/_104355985_gettyimages-1060191940.jpg" type="image/jpeg" xml:lang="en"/>
</entry>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://www.georss.org/georss" xmlns:as="http://activitystrea.ms/spec/1.0/" xmlns:sf="http://superfeedr.com/xmpp-pubsub-ext" xml:lang="en">
<id>https://www.bbc.co.uk/news/world-africa-46221238</id>
<published>2018-11-15T14:35:47.000Z</published>
<updated>2018-11-15T14:35:47.000Z</updated>
<title>Ethiopia arrests former deputy spy chief Yared Zerihun</title>
<summary type="text">Prime Minister Abiy Ahmed promised to combat corruption and rights abuses when he took office.</summary>
<link title="Ethiopia arrests former deputy spy chief Yared Zerihun" rel="alternate" href="https://www.bbc.co.uk/news/world-africa-46221238" type="text/html" xml:lang="en"/>
<link title="Ethiopia arrests former deputy spy chief Yared Zerihun" rel="thumbnail" href="http://c.files.bbci.co.uk/52E9/production/_104352212_872d41ed-8ac9-4b7b-abfc-b4d898a71670.jpg" type="image/jpeg" xml:lang="en"/>
</entry>
</feed>

To get the ids, these are the combinations I have tried:

  1. "/feed/entry/id/text()"
  2. "entry/id/text()"
  3. `doc.GetElementsByTagName("entry").SelectNodes("id/text()")`

I can get to the id by iterating over childNodes, but that would not be XPath.

However, if I try "/*" on the whole document, it does return nodes. Why?

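2 Answers:

Answer 0: (score: 2)

The elements in your XML are declared in the http://www.w3.org/2005/Atom XML namespace.
The XPath expression must take this namespace into account.

You must register that namespace with an XmlNamespaceManager and apply the chosen prefix (here: x) in the XPath expression, as in //x:feed/x:entry/x:id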

XmlDocument doc = new XmlDocument();
String pathToYourXmlFile = @"c:\folder\file.xml";
doc.Load(pathToYourXmlFile);

XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("x", "http://www.w3.org/2005/Atom");
XmlNodeList ids = doc.SelectNodes("//x:feed/x:entry/x:id", nsmgr);
foreach (XmlNode id in ids)
{
    Console.WriteLine(id.InnerText);
}
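
As a rough illustration of why the unprefixed paths return nothing while "/*" still matches, here is a minimal self-contained sketch. The class name AtomIdDemo, its method names, and the trimmed-down sample feed are assumptions made for this example and are not part of the original question. In XPath 1.0 an unprefixed name test such as feed matches only elements in no namespace, whereas * matches any element regardless of namespace.

```csharp
using System;
using System.Xml;

public static class AtomIdDemo
{
    // Hypothetical, trimmed-down feed used only for this sketch.
    private const string SampleXml =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\">" +
        "<id>feed-id</id>" +
        "<entry><id>entry-1</id></entry>" +
        "<entry><id>entry-2</id></entry>" +
        "</feed>";

    private static XmlDocument LoadSample()
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc;
    }

    // Unprefixed names only match elements that are in no namespace,
    // so nothing in the Atom namespace is selected.
    public static int CountWithoutPrefix()
    {
        return LoadSample().SelectNodes("/feed/entry/id").Count;
    }

    // The wildcard matches any element, whatever its namespace,
    // so the root element is selected.
    public static int CountWildcard()
    {
        return LoadSample().SelectNodes("/*").Count;
    }

    // Registering a prefix for the Atom namespace makes the name tests match.
    public static string[] IdsWithPrefix()
    {
        XmlDocument doc = LoadSample();
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("x", "http://www.w3.org/2005/Atom");
        XmlNodeList nodes = doc.SelectNodes("/x:feed/x:entry/x:id", nsmgr);
        var ids = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
        {
            ids[i] = nodes[i].InnerText;
        }
        return ids;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithoutPrefix());
        Console.WriteLine(CountWildcard());
        Console.WriteLine(string.Join(", ", IdsWithPrefix()));
    }
}
```

The prefix x is arbitrary; only the namespace URI passed to AddNamespace has to match the one declared in the document.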

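Answer 1: (score: 1)

Your XML declares the namespace xmlns="http://www.w3.org/2005/Atom" on the root-level node <feed>.

You are using an XPath like /feed/entry/id/text(), but such paths do not match this XML, which is why you are not getting the values you expect.

You need to use the XPath below to get the ids of all <entry> nodes.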

var ids = doc.SelectNodes("//*[name()='feed']/*[name()='entry']/*[name()='id']/text()");

Here is a sample console application I created for demonstration.

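using System;
using System.Xml;

class program
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(@"Path to your xml file");

        // name() compares the element name as written in the document,
        // which sidesteps the need for a namespace manager here.
        var ids = doc.SelectNodes("//*[name()='feed']/*[name()='entry']/*[name()='id']/text()");

        // Each selected node is a text node, so Value holds the id string.
        foreach (XmlNode id in ids)
        {
            Console.WriteLine(id.Value);
        }

        Console.ReadLine();
    }
}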
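
One caveat with name() is that it returns the qualified name as written, so name()='entry' would not match a feed that writes its elements with a prefix, for example a:entry. A possible variant, sketched here under the assumption that you want to stay without a namespace manager, compares local-name() and namespace-uri() instead. The class LocalNameDemo, the method EntryIds, and the prefixed sample feed in Main are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Xml;

public static class LocalNameDemo
{
    private const string AtomNs = "http://www.w3.org/2005/Atom";

    // Builds a step that matches an Atom element by local name and namespace URI,
    // independent of whatever prefix (if any) the document uses.
    private static string Step(string localName)
    {
        return "/*[local-name()='" + localName + "' and namespace-uri()='" + AtomNs + "']";
    }

    // Returns the text of every feed/entry/id element in the given XML string.
    public static string[] EntryIds(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNodeList nodes = doc.SelectNodes(Step("feed") + Step("entry") + Step("id"));
        var ids = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
        {
            ids[i] = nodes[i].InnerText;
        }
        return ids;
    }

    public static void Main()
    {
        // Hypothetical feed that uses an explicit prefix for the Atom namespace.
        string prefixed =
            "<a:feed xmlns:a=\"http://www.w3.org/2005/Atom\">" +
            "<a:id>feed-id</a:id>" +
            "<a:entry><a:id>entry-1</a:id></a:entry>" +
            "</a:feed>";
        Console.WriteLine(string.Join(", ", EntryIds(prefixed)));
    }
}
```

The same expression also works on a feed that uses a default namespace declaration, such as the one in the question, because both predicates ignore the prefix.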
