Converting an XML file with repeated tags to CSV

Time: 2015-06-19 13:30:13

Tags: linq-to-xml

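I'm looking for a way to convert an XML file to a CSV file, using the tag names as the column headers. My problem is that the XML file contains repeated tags, which makes parsing the file and "discovering" the elements difficult. Specifically, the trouble is with the <File> tag.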

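Here is a snippet of the file I'm trying to convert. Note that the number of <File> tags is dynamic: a document could contain 0 or as many as 10 <File> tags.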

<Program>
    <ProgramName>Enrolled Nursing Assistant</ProgramName>
    <Category>Nursing</Category>
    <Credential>Enrolled Nurse</Credential>
    <ProgramLevel>-</ProgramLevel>
    <StartDate>27-09-2004</StartDate>
    <CompletionDate>05-05-2006</CompletionDate>
    <Institution>
        <InstitutionType>College</InstitutionType>
        <SchoolName>EXCELSIOR COMMUNITY COLLEGE(MAIN)</SchoolName>
        <PrimaryLanguage>English</PrimaryLanguage>
        <LanguageOfInstruction>
            <Theory>English</Theory>
            <Clinical>English</Clinical>
        </LanguageOfInstruction>    
        <Address>
            <StreetAddress1>-</StreetAddress1>
            <StreetAddress2>-</StreetAddress2>
            <POBox>-</POBox>
            <City>-</City>
            <State>-</State>
            <Country iso-code='876'>Jamaica</Country>
            <PostalCode>-</PostalCode>
        </Address>
    </Institution>
    <Documents>
        <Document>
            <DocumentType>TRANSCRIPT</DocumentType>
            <DocumentNumber>001</DocumentNumber>
            <IssuedFrom iso-code='876'>Country</IssuedFrom>
            <DateIssued>-</DateIssued>
            <ReceivedDate>28-05-2014</ReceivedDate>
            <Files>
                <File>
                    <Name>001.tiff</Name>
                    <Path>images\education</Path>
                    <Extension>tiff</Extension>
                    <Size>36000</Size>
                    <LastModifiedDate>28-05-2014</LastModifiedDate>
                </File>
                <File>
                    <Name>7002.tiff</Name>
                    <Path>images\education</Path>
                    <Extension>tiff</Extension>
                    <Size>38000</Size>
                    <LastModifiedDate>28-05-2014</LastModifiedDate>
                </File>
                <File>
                    <Name>003.tiff</Name>
                    <Path>images\education</Path>
                    <Extension>tiff</Extension>
                    <Size>50000</Size>
                    <LastModifiedDate>28-05-2014</LastModifiedDate>
                </File>
            </Files>
        </Document>
    </Documents>
</Program>
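Because the number of <File> elements varies, one layout that keeps the header stable is to emit one CSV row per <File> and repeat the parent document's identifier on each row. The following is only a minimal sketch under that assumption, using the element names from the snippet above; `FileRowsSketch`, `FileRows`, and `Quote` are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FileRowsSketch
{
    // Hypothetical helper: wraps a field in quotes and doubles embedded quotes.
    static string Quote(string field)
    {
        return "\"" + (field ?? "").Replace("\"", "\"\"") + "\"";
    }

    // Returns one CSV line per <File>; a program without files yields an empty list.
    public static List<string> FileRows(XElement program)
    {
        return program.Descendants("File")
            .Select(file =>
            {
                // DocumentNumber of the nearest enclosing <Document>, or "" if absent.
                var documentNumber = (string)file.Ancestors("Document")
                    .Elements("DocumentNumber").FirstOrDefault() ?? "";

                // The document number followed by the file's own child values, in order.
                var fields = new[] { documentNumber }
                    .Concat(file.Elements().Select(e => e.Value));

                return string.Join(",", fields.Select(f => Quote(f)));
            })
            .ToList();
    }
}
```

With the snippet above, this would produce three lines, one per file, each starting with the document number. Whether a row per file or a single wide row per program is preferable depends on how the CSV will be consumed.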

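I have a solution that converts it to CSV, but it doesn't handle the repeated tags. It just keeps using the details of the first <File> tag. As a result, my CSV file has columns for 3 files, but all of them contain the first file's details.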

var xml = XDocument.Load(@"C:/path/7123451_53957.xml");
string program = "Program";

Func<XDocument, IEnumerable<string>> getFields =
    xd =>
        xd
            .Descendants(program)
            .SelectMany(d => d.Descendants())
            .Select(e => e.Name.ToString());

var headers =
    String.Join(",",
        getFields(xml)
            .Select(f => csvFormat(f)));

var programQuery =
    (from programs in xml.Descendants(program)
     select string.Join(",",
         getFields(xml)
             .Select(f => programs.Descendants(f).Any()
                 ? programs.Descendants(f).First().Value
                 : "")
             .Select(x => csvFormat(x))))
    .ToList();

I think the problem is in the programs.Descendants(f).First().Value part.
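That suspicion is consistent with how the lookup works: Descendants(f).First() always returns the first element with that name, so every repeated <Name>, <Path>, and so on receives the first file's value. One possible way around it is to stop looking values up by name and instead walk the leaf elements once, taking both the header and the value from the same element. The sketch below assumes a single wide row per <Program>; the class `ProgramCsvSketch`, its methods, and the `_1`, `_2` header suffixes are hypothetical choices, and `CsvFormat` is a stand-in for the csvFormat helper that is not shown in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ProgramCsvSketch
{
    // Stand-in for the question's csvFormat: quote the field, double embedded quotes.
    public static string CsvFormat(string field)
    {
        return "\"" + (field ?? "").Replace("\"", "\"\"") + "\"";
    }

    // Leaf elements (those without child elements) in document order.
    public static List<XElement> Leaves(XElement program)
    {
        return program.Descendants().Where(e => !e.HasElements).ToList();
    }

    // One header per leaf; names that occur more than once get a running suffix.
    public static List<string> Headers(List<XElement> leaves)
    {
        var totals = leaves.GroupBy(e => e.Name.LocalName)
                           .ToDictionary(g => g.Key, g => g.Count());
        var seen = new Dictionary<string, int>();
        var headers = new List<string>();
        foreach (var leaf in leaves)
        {
            var name = leaf.Name.LocalName;
            if (totals[name] == 1)
            {
                headers.Add(name);
                continue;
            }
            int count;
            seen.TryGetValue(name, out count); // count stays 0 on first sight
            seen[name] = count + 1;
            headers.Add(name + "_" + (count + 1));
        }
        return headers;
    }

    // Returns two lines: the header line and the value line for one <Program>.
    public static string[] ToCsvLines(XElement program)
    {
        var leaves = Leaves(program);
        return new[]
        {
            string.Join(",", Headers(leaves).Select(h => CsvFormat(h))),
            string.Join(",", leaves.Select(e => CsvFormat(e.Value)))
        };
    }
}
```

Because headers and values come from the same list of elements, the third <File> contributes its own values instead of the first file's. Note that this builds the header from a single <Program>; if a file held several programs with different numbers of <File> elements, the header would have to be taken from the widest one and shorter rows padded, which this sketch does not attempt.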

0 answers:

No answers