C#: Flatten generic XML to CSV with LINQ

Asked: 2019-05-27 23:56:02

Tags: c# xml csv

There are many related questions on SO, but the problem I am trying to solve seems a bit different.

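Requirements:

  1. A large XML input string with an unknown data structure
  2. An unknown number of child elements
  3. Elements have no attributes
  4. Flatten to a CSV file, using element names as column names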
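
Requirement 4 can be read as "one column per distinct leaf element, named after the chain of element names from the root". The following is a minimal sketch of that reading, assuming a `-` separator and that only elements without child elements carry data; `ColumnNames`, `PathOf`, and `LeafPaths` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ColumnNames
{
    // Hypothetical helper: joins the local names from the root down to the
    // element with "-". AncestorsAndSelf() starts at the element itself,
    // so the sequence is reversed to get root-first order.
    public static string PathOf(XElement element) =>
        string.Join("-", element.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName));

    // Distinct paths of all leaf elements, in order of first appearance.
    public static List<string> LeafPaths(XDocument doc)
    {
        var seen = new HashSet<string>();
        var paths = new List<string>();
        foreach (var leaf in doc.Descendants().Where(e => !e.HasElements))
        {
            var path = PathOf(leaf);
            if (seen.Add(path)) paths.Add(path); // keep only the first occurrence
        }
        return paths;
    }
}
```

Because the name is built from the full ancestor chain, two leaves that share a local name but live under different parents end up in different columns.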

Sample input:

<ARandomRoot>
  <ARandomLOne>
    <Id>12</Id>
    <OtherId>34</OtherId>    
  </ARandomLOne>
  <AnotherRandomLOne>
    <ARandomLTwo>
      <ARandomLTree>
        <NumberOfElements>2</NumberOfElements>
        <ARandomLFour>
          <RandomDataOne>R1</RandomDataOne>
          <RandomDataTwo>10.12</RandomDataTwo>          
        </ARandomLFour>
        <ARandomLFour>
          <RandomDataOne>R2</RandomDataOne>
          <RandomDataTwo>9.8</RandomDataTwo>          
        </ARandomLFour>
      </ARandomLTree>
    </ARandomLTwo>
  </AnotherRandomLOne>
</ARandomRoot>

The output should be:

ARandomRoot-ARandomLOne-Id,ARandomRoot-ARandomLOne-OtherId,ARandomRoot-AnotherRandomLOne-ARandomLTwo-ARandomLTree-NumberOfElements,ARandomRoot-AnotherRandomLOne-ARandomLTwo-ARandomLTree-ARandomLFour-RandomDataOne,ARandomRoot-AnotherRandomLOne-ARandomLTwo-ARandomLTree-ARandomLFour-RandomDataTwo
12,34,2,R1,10.12
12,34,2,R2,9.8
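
One possible way to get from the sample input to this output is a recursive flatten: a leaf element yields a single one-cell row, repeated siblings with the same name contribute alternative rows, and differently named siblings are combined by a cross join. The sketch below assumes that reading of the expected output. `XmlFlattener`, `PathOf`, `Rows`, and `ToCsv` are hypothetical names, and values are written unquoted, so fields containing commas, quotes, or line breaks would still need CSV escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using Row = System.Collections.Generic.List<System.Collections.Generic.KeyValuePair<string, string>>;

public static class XmlFlattener
{
    // Column name: local names from the root down to the element, joined with "-".
    static string PathOf(XElement element) =>
        string.Join("-", element.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName));

    // Returns the rows contributed by one element; a row is an ordered list of (column, value).
    static List<Row> Rows(XElement element)
    {
        if (!element.HasElements) // leaf: one row with a single cell
            return new List<Row> { new Row { new KeyValuePair<string, string>(PathOf(element), element.Value) } };

        var rows = new List<Row> { new Row() };
        foreach (var group in element.Elements().GroupBy(child => child.Name))
        {
            // Same-named siblings are alternatives: their rows are concatenated.
            var groupRows = group.SelectMany(child => Rows(child)).ToList();
            // Differently named siblings are combined: cross join with the rows so far.
            rows = rows.SelectMany(left => groupRows.Select(right => new Row(left.Concat(right)))).ToList();
        }
        return rows;
    }

    public static string ToCsv(string input)
    {
        var rows = Rows(XDocument.Parse(input).Root);

        // Header: every column in order of first appearance across all rows.
        var headers = new List<string>();
        var seen = new HashSet<string>();
        foreach (var cell in rows.SelectMany(row => row))
            if (seen.Add(cell.Key)) headers.Add(cell.Key);

        var lines = new List<string> { string.Join(",", headers) };
        foreach (var row in rows)
            lines.Add(string.Join(",", headers.Select(h =>
                row.Where(c => c.Key == h).Select(c => c.Value).FirstOrDefault() ?? "")));
        return string.Join(Environment.NewLine, lines);
    }
}
```

Calling `XmlFlattener.ToCsv(input)` on the sample input yields the header line and the two data rows shown above. Note that the cross join multiplies row counts when several repeated groups sit side by side, and `XDocument.Parse` loads the whole document into memory, so a very large input may call for a streaming approach instead.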

My code so far, slightly modified from another SO question:

        var xml = XDocument.Parse(input);

        Func<string, string> csvFormat = t => String.Format("\"{0}\"", t.Replace("\"", "\"\""));

        Func<XDocument, IEnumerable<string>> getFields =
            xd =>
                xd
                    .Descendants()
                    .SelectMany(d => d.Elements())
                    .Select(e => e.Name.ToString());

        Func<XDocument, IEnumerable<string>> getHeaders =
            xd =>
                xd
                    .Descendants()
                    .SelectMany(d => d.Elements())
                    .Select(e => e.Name.ToString())
                    .Distinct();

        var headers =
            String.Join(",",
                getHeaders(xml)
                    .Select(f => csvFormat(f)));

        var query =
            from elements in xml.Descendants()
            select string.Join(",",
                getFields(xml)
                    .Select(f => elements.Elements(f).Any()
                        ? elements.Element(f).Value
                        : "")
                    .Select(x => csvFormat(x)));

        var csv =
            String.Join(Environment.NewLine,
                new[] { headers }.Concat(query));

While this produces the desired headers, the data is not flattened.
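
A likely reason is that the query emits one line per element returned by `xml.Descendants()`, and each line only inspects that element's direct children, so values from different nesting levels never land on the same line. The sketch below just counts what the attempt's queries produce; `AttemptDiagnostics` and `Count` are hypothetical names, and the expressions are copied from the code above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttemptDiagnostics
{
    // Returns how many lines the query emits, how many fields each line has,
    // and how many columns the header has, for the attempt shown above.
    public static (int Rows, int FieldsPerRow, int HeaderColumns) Count(string input)
    {
        var xml = XDocument.Parse(input);

        // Same expression as getFields: child element names, duplicates included.
        var fieldNames = xml.Descendants()
            .SelectMany(d => d.Elements())
            .Select(e => e.Name.ToString())
            .ToList();

        return (
            xml.Descendants().Count(),     // one output line per element
            fieldNames.Count,              // fields per line (not de-duplicated)
            fieldNames.Distinct().Count()  // header columns (getHeaders adds Distinct)
        );
    }
}
```

For the sample input this gives 14 lines of 13 fields each under a 10-column header, which shows both problems: rows are per element rather than per flattened record, and `getFields` is missing the `Distinct()` that `getHeaders` applies.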

Can someone point me in the right direction?

0 answers:

No answers