How to get XML markup inside a single node as one value

Date: 2018-07-16 08:05:59

Tags: c# xml

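I'm trying to write a tool in C# that takes any XML file (with an undefined structure) and produces a list of names and values. I have it mostly working, except when the XML contains description-style nodes that themselves contain markup. Take the following XML sample: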

<TESTROOT>
  <MAIN>
    <TITLE>This is a test title</TITLE>
    <VERSION>v1.0</VERSION>
  </MAIN>
  <CONTACT>
    <NAME>Some contact person</NAME>
    <ADDRESS>
      <LINE1>Address line 1</LINE1>
      <LINE2>Address line 2</LINE2>
      <TOWN>Some town here</TOWN>
      <POSTCODE>AN1 WH3</POSTCODE>
    </ADDRESS>
  </CONTACT>
  <DETAIL>
    <NOTES>
      <P>Some text may appear like this in markup tags.</P>
      <P>But is all contained within the NOTES node.</P>
      <P>These may appear in different places, not necessarily called NOTES.</P>
      <P>And may contain <a href="#">Some hyperlinks</a></P>
    </NOTES>
  </DETAIL>
</TESTROOT>

I'm using the following code to walk through the XML above and get each node's path and value:

public void RunMe()
{
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(testxmlstring);

    // Get root node
    XmlNode root = doc.SelectSingleNode("//TESTROOT");

    // Get node info recursive
    GetNode(root, "");
}

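public void GetNode(XmlNode parent, string path)
{
    foreach (XmlNode n in parent.ChildNodes)
    {
        // Element nodes have a null Value; only text-like nodes carry one
        if (n.Value != null)
            Console.WriteLine(String.Format("{0} = {1}", path, n.Value));

        // Recurse into the child itself (not the parent) when it has children
        if (n.HasChildNodes)
            GetNode(n, path + @"\" + n.Name);
    }
}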

With the sample XML and the code above, I get the following output:

\MAIN\TITLE = This is a test title
\MAIN\VERSION = v1.0
\CONTACT\NAME = Some contact person
\CONTACT\ADDRESS\LINE1 = Address line 1
\CONTACT\ADDRESS\LINE2 = Address line 2
\CONTACT\ADDRESS\TOWN = Some town here
\CONTACT\ADDRESS\POSTCODE = AN1 WH3
\DETAIL\NOTES\P = Some text may appear like this in markup tags.
\DETAIL\NOTES\P = But is all contained within the NOTES node.
\DETAIL\NOTES\P = These may appear in different places, not necessarily called NOTES.
\DETAIL\NOTES\P =     And may contain 
\DETAIL\NOTES\P\a = Some hyperlinks

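As you can see, because of the P tags (or any other HTML tags that appear inside it), the NOTES node is split across multiple lines. What I really want is this...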

\MAIN\TITLE = This is a test title
\MAIN\VERSION = v1.0
\CONTACT\NAME = Some contact person
\CONTACT\ADDRESS\LINE1 = Address line 1
\CONTACT\ADDRESS\LINE2 = Address line 2
\CONTACT\ADDRESS\TOWN = Some town here
\CONTACT\ADDRESS\POSTCODE = AN1 WH3
\DETAIL\NOTES = <P>Some text may appear like this in markup tags.</P><P>But is all contained within the NOTES node.</P><P>These may appear in different places, not necessarily called NOTES.</P><P>And may contain <a href="#">Some hyperlinks</a></P>

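So, after that lengthy introduction, my question is: is there a way to get the output shown above? Is it possible to detect markup and get all of the markup inside a node as a single value?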

Thanks, S

1 answer:

Answer 0 (score: 1)

I wrote this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        XmlMapper xmlMapper = new XmlMapper("xml.xml");
        Console.WriteLine("TEST WITHOUT BLACKLIST:\n");
        xmlMapper.PrintMap();
        Console.WriteLine("\nTEST WITH BLACKLIST:\n");
        xmlMapper.PrintMap(new List<string>() { "P" });
    }
}

class XmlMapper
{
    public string FilePath { get; private set; }
    public XDocument XDocument { get; private set; }

    public XmlMapper(string filePath)
    {
        LoadXML(filePath);
    }

    public void LoadXML(string filePath)
    {
        this.FilePath = filePath;
        this.XDocument = XDocument.Load(FilePath);
    }

    public void PrintMap(List<string> blacklist = null)
    {
        PrintElements(XDocument.Elements().ToList(), "", blacklist);
    }

    private void PrintElements(List<XElement> elements, string path, List<string> blacklist = null)
    {
        foreach (XElement element in elements)
        {
            string elementPath = path + "\\" + element.Name;

            // Blacklisted elements are printed as raw markup and not descended into
            if (blacklist != null && blacklist.Contains(element.Name.LocalName))
            {
                Console.WriteLine(string.Format("{0} = {1}", elementPath, element.ToString()));
                continue;
            }
            else
            {
                // Value is the concatenated text of the element and all its descendants
                Console.WriteLine(string.Format("{0} = {1}", elementPath, element.Value));
            }

            if (element.HasElements)
            {
                PrintElements(element.Elements().ToList(), elementPath, blacklist);
            }
        }
    }
}

Output:

TEST WITHOUT BLACKLIST:

\TESTROOT = This is a test titlev1.0Some contact personAddress line 1Address line 2Some town hereAN1 WH3Some text may appear like this in markup tags.But is all contained within the NOTES node.These may appear in different places, not necessarily called NOTES.And may contain Some hyperlinks
\TESTROOT\MAIN = This is a test titlev1.0
\TESTROOT\MAIN\TITLE = This is a test title
\TESTROOT\MAIN\VERSION = v1.0
\TESTROOT\CONTACT = Some contact personAddress line 1Address line 2Some town hereAN1 WH3
\TESTROOT\CONTACT\NAME = Some contact person
\TESTROOT\CONTACT\ADDRESS = Address line 1Address line 2Some town hereAN1 WH3
\TESTROOT\CONTACT\ADDRESS\LINE1 = Address line 1
\TESTROOT\CONTACT\ADDRESS\LINE2 = Address line 2
\TESTROOT\CONTACT\ADDRESS\TOWN = Some town here
\TESTROOT\CONTACT\ADDRESS\POSTCODE = AN1 WH3
\TESTROOT\DETAIL = Some text may appear like this in markup tags.But is all contained within the NOTES node.These may appear in different places, not necessarily called NOTES.And may contain Some hyperlinks
\TESTROOT\DETAIL\NOTES = Some text may appear like this in markup tags.But is all contained within the NOTES node.These may appear in different places, not necessarily called NOTES.And may contain Some hyperlinks
\TESTROOT\DETAIL\NOTES\P = Some text may appear like this in markup tags.
\TESTROOT\DETAIL\NOTES\P = But is all contained within the NOTES node.
\TESTROOT\DETAIL\NOTES\P = These may appear in different places, not necessarily called NOTES.
\TESTROOT\DETAIL\NOTES\P = And may contain Some hyperlinks
\TESTROOT\DETAIL\NOTES\P\a = Some hyperlinks

TEST WITH BLACKLIST:

\TESTROOT = This is a test titlev1.0Some contact personAddress line 1Address line 2Some town hereAN1 WH3Some text may appear like this in markup tags.But is all contained within the NOTES node.These may appear in different places, not necessarily called NOTES.And may contain Some hyperlinks
\TESTROOT\MAIN = This is a test titlev1.0
\TESTROOT\MAIN\TITLE = This is a test title
\TESTROOT\MAIN\VERSION = v1.0
\TESTROOT\CONTACT = Some contact personAddress line 1Address line 2Some town hereAN1 WH3
\TESTROOT\CONTACT\NAME = Some contact person
\TESTROOT\CONTACT\ADDRESS = Address line 1Address line 2Some town hereAN1 WH3
\TESTROOT\CONTACT\ADDRESS\LINE1 = Address line 1
\TESTROOT\CONTACT\ADDRESS\LINE2 = Address line 2
\TESTROOT\CONTACT\ADDRESS\TOWN = Some town here
\TESTROOT\CONTACT\ADDRESS\POSTCODE = AN1 WH3
\TESTROOT\DETAIL = Some text may appear like this in markup tags.But is all contained within the NOTES node.These may appear in different places, not necessarily called NOTES.And may contain Some hyperlinks
\TESTROOT\DETAIL\NOTES = Some text may appear like this in markup tags.But is all contained within the NOTES node.These may appear in different places, not necessarily called NOTES.And may contain Some hyperlinks
\TESTROOT\DETAIL\NOTES\P = <P>Some text may appear like this in markup tags.</P>
\TESTROOT\DETAIL\NOTES\P = <P>But is all contained within the NOTES node.</P>
\TESTROOT\DETAIL\NOTES\P = <P>These may appear in different places, not necessarily called NOTES.</P>
\TESTROOT\DETAIL\NOTES\P = <P>And may contain <a href="#">Some hyperlinks</a></P>
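The blacklist output above still prints each P on its own line, whereas the question asks for a single \DETAIL\NOTES value. Here is a rough sketch of one way to get closer to that, not a definitive implementation. It assumes markup can be recognised by tag name: the class name MarkupAwareMapper and the MarkupTags list are made up for illustration and would need adjusting to the real data. An element whose child elements are all markup tags is treated as one value, and its child nodes are serialised with XNode.ToString(SaveOptions.DisableFormatting).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class MarkupAwareMapper
{
    // Hypothetical list of tag names treated as HTML-style markup (an assumption).
    static readonly HashSet<string> MarkupTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "p", "a", "b", "i", "br", "ul", "ol", "li" };

    // Returns "path = value" lines; the root element is left out of the paths,
    // as in the output requested in the question.
    public static List<string> Map(string xml)
    {
        var lines = new List<string>();
        Walk(XDocument.Parse(xml).Root, "", lines);
        return lines;
    }

    static void Walk(XElement parent, string path, List<string> lines)
    {
        foreach (XElement child in parent.Elements())
        {
            string childPath = path + "\\" + child.Name.LocalName;

            if (!child.HasElements)
            {
                // Plain leaf element: emit its text.
                lines.Add(childPath + " = " + child.Value);
            }
            else if (child.Elements().All(e => MarkupTags.Contains(e.Name.LocalName)))
            {
                // Every child element is a markup tag: emit the inner XML as one value.
                string inner = string.Concat(
                    child.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
                lines.Add(childPath + " = " + inner);
            }
            else
            {
                // Ordinary container element: keep descending.
                Walk(child, childPath, lines);
            }
        }
    }
}
```

Calling MarkupAwareMapper.Map(testxmlstring) with the sample XML from the question and writing each returned line to the console should give one \DETAIL\NOTES line containing all four P elements. The trade-off is that a data element which happens to be named like a markup tag (for example a field called B) would be misclassified, so the tag list is the part to tune.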