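Better LINQ parsing of large XML data

Time: 2018-10-10 10:20:14

Tags: c# xml linq

I have an application that takes in multiple XML files and performs lookups to create a CSV file. I have noticed that the output is not always 100% complete, i.e. a result or two is missing, so I have concluded that the way I process the data is incorrect. I would greatly appreciate help from the experts here.

Small XML sample: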

<?xml version="1.0" encoding="utf-8"?>
<lookupdb xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:sample:lookupdb:0.1">
    <References>
          <Reference id="3cb7ceb0-43c7-4c67-a7fb-fffb32fc71c4">
            <Vehicle>Beach_Buggy_01</Vehicle>
            <Engineers>
              <Engineer>Joe Bloggs</Engineer>
            </Engineers>
            <IsActive>true</IsActive>
            <Owner>Bill Bloggs</Owner>
            <Serviced>True</Serviced>
            <OwnerName>Bill</OwnerName>
            <CostID>ABCDEF123456</CostID>
            <FuelType>Petrol</FuelType>
            <Phone>1234567890</Phone>
            <Address>Some Address</Address>
          </Reference>
          <Reference id="d1053bd3-a1cb-4fb4-a7d5-ffee3e10ffdb">
            <Vehicle>Transit</Vehicle>
            <Engineers>
              <Engineer>Joe Bloggs2</Engineer>
            </Engineers>
            <IsActive>true</IsActive>
            <Owner>Andy Bloggs</Owner>
            <Serviced>True</Serviced>
            <OwnerName>Andy</OwnerName>
            <CostID>9345089</CostID>
            <FuelType>Petrol</FuelType>
            <Phone>1234567890</Phone>
            <Address>Some Address4</Address>
          </Reference>
          <Reference id="30f8cfe8-40fd-4c99-9c7d-5ab98f8e5620">
            <Vehicle>Ford Fiesta</Vehicle>
            <Engineers>
              <Engineer>Steve Bloggs</Engineer>
            </Engineers>
            <IsActive>true</IsActive>
            <Owner>Sarah H</Owner>
            <Serviced>True</Serviced>
            <OwnerName>Bill</OwnerName>
            <CostID>834hsdfgs</CostID>
            <FuelType>Petrol</FuelType>
            <Phone>1234567890</Phone>
            <Address>Some Address3</Address>
          </Reference>
    </References>
    <Sessions>
        <RentalSession id="cc5d9960-3a80-4fd9-b7d6-0963198567c3">
              <VehicleRefId>3cb7ceb0-43c7-4c67-a7fb-fffb32fc71c4</VehicleRefId>
              <RentalPeriod startDate="2018-10-02T07:46:34Z" endDate="2018-10-02T08:27:36Z" />
              <HiringInfo HireId="2e428f42-f8f1-4603-9570-fed1fa78e470" customerId="1929936734" customerRefId="6da73407-f443-491d-9cad-c4fed9bfb71f" />
              <Notes>Vehicle Broke Down Recovery ordered</Notes>
              <VehicleGroup>ATV</VehicleGroup>
        </RentalSession>
        <RentalSession id="829221a2-196e-403a-bdcb-9759959cfa70">
              <VehicleRefId>3cb7ceb0-43c7-4c67-a7fb-fffb32fc71c4</VehicleRefId>
              <RentalPeriod startDate="2018-10-03T07:46:34Z" endDate="2018-10-04T08:27:36Z" />
              <HiringInfo HireId="4fb2cd21-9f48-44de-ae72-01ce4eeccdf9" customerId="2929936735" customerRefId="0a2d3d8b-ab06-4cd1-9ec5-aea4ac3f6da3" />
              <Notes>Returned on Time no Damage</Notes>
              <VehicleGroup>ATV</VehicleGroup>
        </RentalSession>
        <RentalSession id="68a6b485-d30a-439a-8081-8c09f724d23b">
              <VehicleRefId>d1053bd3-a1cb-4fb4-a7d5-ffee3e10ffdb</VehicleRefId>
              <RentalPeriod startDate="2018-10-05T07:46:34Z" endDate="2018-10-05T08:27:36Z" />
              <HiringInfo HireId="c4022764-7fc2-4415-97bf-57d616e3b8bd" customerId="3929936736" customerRefId="cb260bfc-34c1-4ac5-befa-17f69b2406bb" />
              <Notes>Scratch to Door Charges applied</Notes>
              <VehicleGroup>VANS</VehicleGroup>
        </RentalSession>
        <RentalSession id="c4083f9a-65ee-4693-8488-e299271064b1">
              <VehicleRefId>30f8cfe8-40fd-4c99-9c7d-5ab98f8e5620</VehicleRefId>
              <RentalPeriod startDate="2018-10-09T07:46:34Z" endDate="2018-10-09T08:27:36Z" />
              <HiringInfo HireId="cb260bfc-34c1-4ac5-befa-17f69b2406bb" customerId="4929936737" customerRefId="c4022764-7fc2-4415-97bf-57d616e3b8bd" />
              <Notes>Generally a rubbish vehicle</Notes>
              <VehicleGroup>Small Cars</VehicleGroup>
        </RentalSession>
    </Sessions>
</lookupdb>

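The user name is the program's main lookup value, and the engineers are needed as well. The VehicleRefId in a session matches a Reference id, while most of the data comes from the rental sessions. From some local testing, fetching the session data first seems to work better, but I am not entirely sure about this approach. Here is the code that I think needs looking at:

1: Getting the rental data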

 var result = xDoc.Descendants().Descendants(ns + "RentalSession")
                            .Where(x => x.Element(ns + "VehicleRefId").Value != null)
                            .Select(x => new
                            {
                                _VehicleRefId = GetResultValue(true, x, "VehicleRefId", "VehicleRefId", "Vehicle Reference ID"),
                                _RentalSessionId = GetResultValue(false, x, "RentalSession", "id", "Session ID"),
                                _startDate = GetResultValue(false, x, "RentalPeriod", "startDate", "Start date"),
                                _endDate = GetResultValue(false, x, "RentalPeriod", "endDate", "End date"),
                                _VehicleGroup = GetResultValue(true, x, "VehicleGroup", "VehicleGroup", "Vehicle Group"),
                                _Notes = GetResultValue(true, x, "Notes", "Notes", "Event Notes")
                            }).ToList().Distinct();

2: The method used in the rental data query:

private string GetResultValue(bool isNode, XElement atrr_value,string nodeName, string xattr_Name, string value_text)
{
    string retValue = "";
    try
    {
        switch(isNode)
        {
            case true:
                    retValue = !string.IsNullOrEmpty((string)atrr_value.Element(ns + nodeName).Value)
                                       ? (string)atrr_value.Element(ns + nodeName).Value
                                          : $"No {value_text} Found.";
                    break;
            default:
                    if(nodeName == "RentalSession")
                    {
                        retValue = !string.IsNullOrEmpty((string)atrr_value.Attribute(xattr_Name).Value)
                                       ? (string)atrr_value.Attribute(xattr_Name).Value
                                          : $"No {value_text} Found.";
                    }
                    else
                    {
                        retValue = !string.IsNullOrEmpty((string)atrr_value.Element(ns + nodeName).Attribute(xattr_Name).Value)
                                       ? (string)atrr_value.Element(ns + nodeName).Attribute(xattr_Name).Value
                                          : $"No {value_text} Found.";
                    }
                    break;
        }
    }
    catch(Exception rex)
    {
        retValue = "null";
    }

    return retValue;
}

3: Getting the owner and engineer data:

foreach(var itemData in result)
{
    try
    {
        var references = xDoc.Descendants().Descendants(ns + "Reference")
                         .Where(
                                a => a.Attribute("id").Value == itemData._VehicleRefId
                               )
                         .Select(a => new
                         {
                                _OwnerName = a.Element(ns + "OwnerName").Value,
                                _Engineer = a.Elements(ns + "Engineers").Descendants(ns + "Engineer").Select(e => e.Value).Single()
                         }).FirstOrDefault();

                         // ... Further parsing
    }
    catch (Exception xEx)
    {
        //some error handling stuff
    }
}

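I would really appreciate any help in understanding where I am going wrong, both to learn and to streamline this part of the code.

Many thanks.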

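Edit: the XML above shows only part of the data. There will be multiple references and sessions, and some sessions will match the same reference.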
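For illustration only, the lookup described in this question (each session's VehicleRefId pointing at a Reference id, with several sessions possibly sharing one reference) can be sketched as a single pass over the sessions against an index of references built once. This is a minimal sketch and not the code from the question: `SessionJoinSketch` and `BuildRows` are hypothetical names, the ids in the demo are shortened and made up, and the rows are joined with commas without any CSV escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SessionJoinSketch
{
    static readonly XNamespace ns = "urn:sample:lookupdb:0.1";

    // Builds one comma-separated row per RentalSession:
    // session id, reference id, owner name, engineers.
    public static List<string> BuildRows(XDocument xDoc)
    {
        // Index every Reference by its id once, instead of re-scanning
        // the whole document for each session.
        var referencesById = xDoc.Descendants(ns + "Reference")
            .Where(r => r.Attribute("id") != null)
            .GroupBy(r => (string)r.Attribute("id"))
            .ToDictionary(g => g.Key, g => g.First()); // first wins if an id repeats

        var rows = new List<string>();
        foreach (var session in xDoc.Descendants(ns + "RentalSession"))
        {
            var refId = (string)session.Element(ns + "VehicleRefId"); // null if missing
            XElement reference = null;
            if (refId != null)
            {
                referencesById.TryGetValue(refId, out reference);
            }

            // Join all engineers; Single() would throw for zero or several.
            var engineers = reference == null
                ? ""
                : string.Join(";", reference.Descendants(ns + "Engineer").Select(e => e.Value));

            rows.Add(string.Join(",",
                (string)session.Attribute("id"),
                refId,
                (string)reference?.Element(ns + "OwnerName"),
                engineers));
        }
        return rows;
    }

    public static void Main()
    {
        // Shortened, made-up ids; element names follow the sample XML.
        var xDoc = XDocument.Parse(@"
<lookupdb xmlns=""urn:sample:lookupdb:0.1"">
  <References>
    <Reference id=""r1"">
      <Engineers><Engineer>Joe Bloggs</Engineer></Engineers>
      <OwnerName>Bill</OwnerName>
    </Reference>
  </References>
  <Sessions>
    <RentalSession id=""s1""><VehicleRefId>r1</VehicleRefId></RentalSession>
    <RentalSession id=""s2""><VehicleRefId>r1</VehicleRefId></RentalSession>
    <RentalSession id=""s3""><VehicleRefId>missing</VehicleRefId></RentalSession>
  </Sessions>
</lookupdb>");

        foreach (var row in BuildRows(xDoc))
        {
            Console.WriteLine(row);
        }
    }
}
```

In this sketch a session whose reference cannot be found still produces a row with empty owner and engineer fields, so it stays visible in the output instead of disappearing inside a catch block. Whether that matches the intended CSV layout is an assumption.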

2 answers:

Answer 0: (score: 1)

Don't use the Value property, which causes problems when the element is null. Instead, use a cast, as in the code below.

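// A single Descendants(name) call visits each RentalSession once.
var result = xDoc.Descendants(ns + "RentalSession")
                            .Where(x => (string)x.Element(ns + "VehicleRefId") != null)
                            .Select(x => new
                            {
                                _VehicleRefId = (string)x.Element(ns + "VehicleRefId"),
                                // "id" is an attribute of the RentalSession element itself.
                                _RentalSessionId = (string)x.Attribute("id"),
                                // The dates are attributes of RentalPeriod; a missing one yields null.
                                _startDate = (DateTime?)x.Element(ns + "RentalPeriod")?.Attribute("startDate"),
                                _endDate = (DateTime?)x.Element(ns + "RentalPeriod")?.Attribute("endDate"),
                                _VehicleGroup = (string)x.Element(ns + "VehicleGroup"),
                                _Notes = (string)x.Element(ns + "Notes")
                            }).ToList().Distinct();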
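The difference between the cast and the Value property can be seen in isolation with a small console sketch. It assumes only the namespace and element names from the sample XML. `CastVersusValue` and `ReadOrDefault` are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Xml.Linq;

public static class CastVersusValue
{
    // Reads a child element's text via the explicit cast and
    // returns the fallback when the child element is absent.
    public static string ReadOrDefault(XElement parent, XName childName, string fallback)
    {
        // Casting a null XElement to string evaluates to null instead of throwing.
        return (string)parent.Element(childName) ?? fallback;
    }

    public static void Main()
    {
        XNamespace ns = "urn:sample:lookupdb:0.1";

        // A session that has a VehicleGroup child but no Notes child.
        var session = new XElement(ns + "RentalSession",
            new XAttribute("id", "cc5d9960-3a80-4fd9-b7d6-0963198567c3"),
            new XElement(ns + "VehicleGroup", "ATV"));

        Console.WriteLine(ReadOrDefault(session, ns + "VehicleGroup", "No Vehicle Group Found."));
        Console.WriteLine(ReadOrDefault(session, ns + "Notes", "No Event Notes Found."));

        try
        {
            // Element(...) returns null here, so .Value dereferences null.
            Console.WriteLine(session.Element(ns + "Notes").Value);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine(".Value threw NullReferenceException for the missing element.");
        }
    }
}
```

Note that the cast returns an empty string, not null, for an element that is present but empty, so a check such as string.IsNullOrEmpty is still needed if empty elements should also get the fallback text.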

Answer 1: (score: 1)

Use xmltocsharp (https://github.com/hyperledger/composer/issues/4375) to get classes that are compatible with XmlSerializer.

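Copy and paste your XML into the editor at the link above, then click the Convert button to get your classes.

With this approach, you can easily get the XML node or attribute values you need via dot notation on the class objects.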
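The generated classes themselves are not shown in this answer. As a rough idea of their shape, here is a hand-written sketch covering only the members that the demonstration uses. It is an assumption about what a converter might produce, not its actual output: the property types (plain strings, a single Engineer per Engineers element) and the `LookupdbLoader` helper are hypothetical. It is meant to compile alongside a console program such as the one in this answer.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("lookupdb", Namespace = Lookupdb.Ns)]
public class Lookupdb
{
    public const string Ns = "urn:sample:lookupdb:0.1";

    [XmlElement("References", Namespace = Ns)]
    public References References { get; set; }

    [XmlElement("Sessions", Namespace = Ns)]
    public Sessions Sessions { get; set; }
}

public class References
{
    // Repeated <Reference> children map to a list.
    [XmlElement("Reference", Namespace = Lookupdb.Ns)]
    public List<Reference> Reference { get; set; } = new List<Reference>();
}

public class Reference
{
    // Attributes in the sample carry no namespace.
    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlElement("OwnerName", Namespace = Lookupdb.Ns)]
    public string OwnerName { get; set; }

    [XmlElement("Engineers", Namespace = Lookupdb.Ns)]
    public Engineers Engineers { get; set; }
}

public class Engineers
{
    // Assumes exactly one <Engineer> per <Engineers>, as in the sample.
    [XmlElement("Engineer", Namespace = Lookupdb.Ns)]
    public string Engineer { get; set; }
}

public class Sessions
{
    [XmlElement("RentalSession", Namespace = Lookupdb.Ns)]
    public List<RentalSession> RentalSession { get; set; } = new List<RentalSession>();
}

public class RentalSession
{
    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlElement("VehicleRefId", Namespace = Lookupdb.Ns)]
    public string VehicleRefId { get; set; }

    [XmlElement("RentalPeriod", Namespace = Lookupdb.Ns)]
    public RentalPeriod RentalPeriod { get; set; }

    [XmlElement("VehicleGroup", Namespace = Lookupdb.Ns)]
    public string VehicleGroup { get; set; }
}

public class RentalPeriod
{
    // Kept as strings here; parsing to DateTime is left to the caller.
    [XmlAttribute("startDate")]
    public string StartDate { get; set; }

    [XmlAttribute("endDate")]
    public string EndDate { get; set; }
}

public static class LookupdbLoader
{
    // Deserializes a lookupdb document held in a string.
    public static Lookupdb FromString(string xml)
    {
        var serializer = new XmlSerializer(typeof(Lookupdb));
        using (var reader = new StringReader(xml))
        {
            return (Lookupdb)serializer.Deserialize(reader);
        }
    }
}
```

Elements that have no matching property (Vehicle, Notes, HiringInfo and so on) are simply skipped during deserialization, so the sketch can be extended one property at a time.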

Here I have created a console application for demonstration.

using System;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

class Program
{
    static void Main(string[] args)
    {
        // Lookupdb, RentalSession and Reference are the classes generated from the XML.
        var xml = System.IO.File.ReadAllText(@"C:\Users\Nullplex6\source\repos\ConsoleApp4\ConsoleApp4\Files\XMLFile9.xml");
        var serializer = new XmlSerializer(typeof(Lookupdb));
        using (var reader = new StringReader(xml))
        {
            Lookupdb lookupdb = (Lookupdb)serializer.Deserialize(reader);

            // Get any node or attribute value of a single "RentalSession" in "Sessions" by passing its id to the predicate.
            RentalSession rentalSession = lookupdb.Sessions.RentalSession.FirstOrDefault(x => x.Id == "68a6b485-d30a-439a-8081-8c09f724d23b");

            Console.WriteLine("Id: " + rentalSession.Id);
            Console.WriteLine("VehicleRefId: " + rentalSession.VehicleRefId);
            Console.WriteLine("EndDate: " + rentalSession.RentalPeriod.EndDate);
            Console.WriteLine("VehicleGroup: " + rentalSession.VehicleGroup);

            Console.WriteLine();

            // Get any node or attribute value of a single "Reference" in "References" by passing its id to the predicate.
            Reference reference = lookupdb.References.Reference.FirstOrDefault(x => x.Id == "d1053bd3-a1cb-4fb4-a7d5-ffee3e10ffdb");

            Console.WriteLine("OwnerName: " + reference.OwnerName);
            Console.WriteLine("Engineer: " + reference.Engineers.Engineer);

            Console.ReadLine();
        }
    }
}

Output: [screenshot of the console window; image not available]