Complex nested XML parsing in C#

Time: 2013-08-06 14:12:20

Tags: c# xml parsing linq-to-xml

<ndActivityLog repositoryId="AA-AAAA1AAA" repositoryName="Company Name" startDate="2013-07-05" endDate="2013-07-06">
    <activity date="2013-07-05T06:42:35" name="open" host="00.00.00.00">
        <user id="joebloggs@email.com" name="Joe Bloggs" memberType="I" /> 
        <storageObject docId="0000-0000-0000" name="Opinion" size="356864" fileExtension="doc">
            <cabinet name="Client and Matters">NG-5MIYABBV</cabinet> 
            <DocumentType>Legal Document</DocumentType> 
            <Author>Joe Bloggs</Author> 
            <Matter>1001</Matter> 
            <Client>R1234</Client> 
        </storageObject>
    </activity>
</ndActivityLog>

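This is a sample of the XML. The document contains roughly 4000 "activity" elements with varying content. Some have "Client" and "Matter" elements, others do not. If you picture it as a table, those would be blank cells, but the column headers would still exist.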

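I basically need to parse this into a SQL database while preserving the data structure. In addition, if an element is absent from a given activity, that fact needs to be recorded and the value kept as a "blank cell".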

var doc = XDocument.Load(path + "\\" + file + ".xml");

var root = doc.Root;
foreach (XElement el in root.Elements())
{
    // Console.WriteLine(el.Nodes());
    // Console.WriteLine(el.Value);
    // Console.WriteLine("  Attributes:");
    foreach (XAttribute attr in el.Attributes())
    {
        Console.WriteLine(attr);
        // Console.WriteLine(el.Elements("id"));
    }

    Console.WriteLine("---------------------------");

    // foreach (XElement element in el.Elements())
    // {
    //     Console.WriteLine("    {0}: {1}", element.Name, element.Value);
    // }
}

// hold console open
Console.ReadLine();

That is the code so far. The output looks like this:

date="2013-07-06T17:07:42"
name="open"
host="213.146.142.50"

I basically need to extract every piece of information so that I can store it in a table layout. I am new to XML parsing, so any help is appreciated.

4 answers:

Answer 0: (score: 0)

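Only you know which names are allowed (cabinet ... Client). The simple brute-force approach is to try to extract each expected item by name; you then know which ones are missing and can set those cells to empty. A foreach loop only iterates over what is actually present on each element, so it cannot guess which items are missing.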
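As a rough illustration of that brute-force idea, here is a minimal sketch. The class name `ExpectedFields` and the column list are assumptions made for this example, taken from the child elements visible in the question's sample; adjust them to whatever your real data can contain. It relies on the fact that casting a missing `XElement` to `string` gives `null` rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class ExpectedFields
{
    // Assumed column list: the child elements of <storageObject> seen in the sample.
    public static readonly string[] Names =
        { "cabinet", "DocumentType", "Author", "Matter", "Client" };

    // Returns one entry per expected name; the value is null when the element is absent.
    public static Dictionary<string, string> Extract(XElement storageObject)
    {
        return Names.ToDictionary(
            name => name,
            name => (string)storageObject.Element(name)); // cast yields null if missing
    }
}

static class ExpectedFieldsDemo
{
    static void Main()
    {
        var storageObject = XElement.Parse(
            "<storageObject><Author>Joe Bloggs</Author><Client>R1234</Client></storageObject>");

        // Every expected name is listed, whether or not the element was present.
        foreach (var pair in ExpectedFields.Extract(storageObject))
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value ?? "(blank)");
    }
}
```

Because the dictionary always has the same keys, every row ends up with the same "column headers", and a `null` value marks a blank cell.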

Answer 1: (score: 0)

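I think you could solve the problem as follows:

  1. Create a class called BaseNode.

  2. Create classes extending BaseNode for all entity types.

  3. Create a set of rules that determine the preferred entity type based on the node.

  4. Create a generateEntity method in the BaseNode class.

  5. Use this algorithm (this is not code, so do not try to compile it):

    parseXML(node)

        for each child in node do

            BaseNode.generateEntity(child.input)

            if (child.hasChildren())

                parseXML(child)

            end if

        end for

    end parseXML

    Of course, you still have to store and process the generated entities.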
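The recursive walk in the pseudocode could look roughly like this with LINQ to XML. This is only a sketch: it does not implement the BaseNode entity classes, and instead of `generateEntity` it simply records a path and text value for each leaf element. The names `XmlWalker` and `ParseXml` are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

static class XmlWalker
{
    // Visits 'node' and all of its descendants depth first and records
    // "path = text" for every element that has no child elements.
    // A real implementation would build an entity here instead of a string.
    public static void ParseXml(XElement node, string parentPath, List<string> output)
    {
        string path = parentPath + "/" + node.Name.LocalName;

        if (!node.HasElements)
            output.Add(path + " = " + node.Value);

        foreach (XElement child in node.Elements())
            ParseXml(child, path, output); // recurse into each child element
    }
}

static class XmlWalkerDemo
{
    static void Main()
    {
        var root = XElement.Parse(
            "<activity><storageObject>" +
            "<Author>Joe Bloggs</Author><Matter>1001</Matter>" +
            "</storageObject></activity>");

        var lines = new List<string>();
        XmlWalker.ParseXml(root, "", lines);
        lines.ForEach(Console.WriteLine);
    }
}
```

Note that a generic walk like this only reports what is present; it still cannot tell you which expected elements are missing.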

Answer 2: (score: 0)

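I am not saying this is the best or the right way to solve your particular problem, but I offer it as a brief example of what you could do (hence the lack of exception/error handling and so on).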

namespace so.consoleapp
{
    using System;
    using System.Collections.Generic;
    using System.Xml.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            var doc = XElement.Load("file.xml");
            var activityElements = doc.Elements("activity");

            ICollection<Activity> collectionOfActivities = new List<Activity>();
            foreach (var activityElement in activityElements)
            {
                var storageObjectElement = activityElement.Element("storageObject");

                // Casting an XElement to string yields null when the element is
                // missing, so an absent <Client> or <Author> becomes null (a
                // "blank cell") instead of throwing a NullReferenceException.
                var newStorageObject = new StorageObject
                {
                    Client = (string)storageObjectElement.Element("Client"),
                    Author = (string)storageObjectElement.Element("Author")
                };

                var userElement = activityElement.Element("user");
                var newUser = new User
                {
                    Id = userElement.Attribute("id").Value,
                    Name = userElement.Attribute("name").Value,
                    MemberType = userElement.Attribute("memberType").Value
                };

                collectionOfActivities.Add
                (
                    new Activity
                    {
                        Date = activityElement.Attribute("date").Value,
                        Name = activityElement.Attribute("name").Value,
                        Host = activityElement.Attribute("host").Value,
                        User = newUser,
                        StorageObject = newStorageObject
                    }
                );
            }

            Console.ReadLine();
        }
    }

    class Activity
    {
        public string Date { get; set; }
        public string Name { get; set; }
        public string Host { get; set; }
        public User User { get; set; }
        public StorageObject StorageObject { get; set; }
    }

    class User
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public string MemberType { get; set; }
    }

    class StorageObject
    {
        public string Client { get; set; }
        public string Author { get; set; }
    }
}
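Since the end goal in the question is a SQL table, one possible next step is to flatten the parsed values into a `DataTable` with a fixed set of columns and use `DBNull.Value` for anything missing. The sketch below is an assumption about how that might be done, not part of the code above; the column names and the `ActivityTable` class are made up for illustration. A table like this could then be handed to something like `SqlBulkCopy.WriteToServer`, which is not shown here.

```csharp
using System;
using System.Data;
using System.Xml.Linq;

static class ActivityTable
{
    // Assumed column names; every row gets all of them, present or not.
    static readonly string[] Columns =
        { "Date", "Name", "Host", "UserId", "Author", "Matter", "Client" };

    static object OrDbNull(string value) => value ?? (object)DBNull.Value;

    public static DataTable Build(XElement log)
    {
        var table = new DataTable("Activity");
        foreach (string column in Columns)
            table.Columns.Add(column, typeof(string));

        foreach (XElement activity in log.Elements("activity"))
        {
            XElement user = activity.Element("user");
            XElement so = activity.Element("storageObject");

            // The (string) casts return null for missing attributes/elements.
            table.Rows.Add(
                OrDbNull((string)activity.Attribute("date")),
                OrDbNull((string)activity.Attribute("name")),
                OrDbNull((string)activity.Attribute("host")),
                OrDbNull((string)user?.Attribute("id")),
                OrDbNull((string)so?.Element("Author")),
                OrDbNull((string)so?.Element("Matter")),
                OrDbNull((string)so?.Element("Client")));
        }
        return table;
    }
}

static class ActivityTableDemo
{
    static void Main()
    {
        var log = XElement.Parse(
            "<ndActivityLog>" +
            "<activity date=\"2013-07-05T06:42:35\" name=\"open\" host=\"00.00.00.00\">" +
            "<user id=\"joebloggs@email.com\" name=\"Joe Bloggs\" memberType=\"I\" />" +
            "<storageObject><Author>Joe Bloggs</Author><Client>R1234</Client></storageObject>" +
            "</activity></ndActivityLog>");

        DataTable table = ActivityTable.Build(log);
        foreach (DataRow row in table.Rows)
            foreach (DataColumn column in table.Columns)
                Console.WriteLine("{0}: {1}", column.ColumnName,
                    row.IsNull(column) ? "(blank)" : row[column]);
    }
}
```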

Answer 3: (score: 0)

Try something like this. Create a new Windows Forms Application, add a DataGridView control to the form, and use the following as the code-behind:

private void Form1_Load(object sender, EventArgs e)
        {
            populate_datagrid(dataGridView1);
        }

        private void populate_datagrid(DataGridView dataGridView1)
        {
            String xml_string = @"<ndActivityLog repositoryId=""AA-AAAA1AAA"" repositoryName=""Company Name"" startDate=""2013-07-05"" endDate=""2013-07-06"">
                                    <activity date=""2013-07-05T06:42:35"" name=""open"" host=""00.00.00.00"">
                                        <user id=""joebloggs@email.com"" name=""Joe Bloggs"" memberType=""I"" /> 
                                        <storageObject docId=""0000-0000-0000"" name=""Opinion"" size=""356864"" fileExtension=""doc"">
                                            <cabinet name=""Client and Matters"">NG-5MIYABBV</cabinet> 
                                            <DocumentType>Legal Document</DocumentType> 
                                            <Author>Joe Bloggs</Author> 
                                            <Matter>1001</Matter> 
                                            <Client>R1234</Client> 
                                        </storageObject>
                                    </activity>
                                    <activity date=""2013-06-05T06:42:35"" name=""close"" host=""00.00.00.00"">
                                        <user id=""abc@bca.com"" name=""abc"" memberType=""I"" /> 
                                        <storageObject docId=""0000-0000-0000"" name=""Opinion"" size=""25630"" fileExtension=""doc"">
                                            <cabinet name=""Client and Matters"">NG-5MIYABBV</cabinet> 
                                            <DocumentType>Legal Document</DocumentType> 
                                            <Author>abc</Author> 
                                            <Client>R1234</Client> 
                                        </storageObject>
                                    </activity>
                                    <activity date=""2013-06-05T06:42:35"" name=""unknown"" host=""00.00.00.00"">
                                        <user id=""bca@abc.com"" name=""bca"" memberType=""I"" /> 
                                        <storageObject docId=""0000-0000-0000"" name=""Opinion"" size=""45875"" fileExtension=""doc"">
                                            <cabinet name=""Client and Matters"">NG-5MIYABBV</cabinet> 
                                            <DocumentType>Legal Document</DocumentType> 
                                            <Author>bca</Author> 
                                            <Matter>1001</Matter> 
                                        </storageObject>
                                    </activity>
                                    <activity date=""2013-06-05T06:42:35"" name=""open"" host=""00.00.00.00"">
                                        <user id=""cab@abc.com"" name=""cab"" memberType=""I"" /> 
                                        <storageObject docId=""0000-0000-0000"" name=""Opinion"" size=""45875"" fileExtension=""doc"">
                                            <cabinet name=""Client and Matters"">NG-5MIYABBV</cabinet> 
                                            <DocumentType>Legal Document</DocumentType>
                                        </storageObject>
                                    </activity>
                                </ndActivityLog>";

            // Author and Matter are nested inside <storageObject>, so they are looked up
            // with Descendants(); Elements() on <activity> would not find them.
            var query = from XElement c in System.Xml.Linq.XElement.Parse(xml_string).Descendants("activity")
                        select new
                        {
                            user = c.Elements("user").First().Attribute("name").Value,
                            author = (string)c.Descendants("Author").FirstOrDefault() ?? "n/a",
                            matter = (string)c.Descendants("Matter").FirstOrDefault() ?? "n/a"
                        };

            dataGridView1.DataSource = query.ToList();

        }

Hope this helps.