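How to select XML nodes by position (LINQ or XPath)

Time: 2016-04-08 16:26:47

Tags: c# xml linq xpath

I have been struggling with this and have not gotten anything to work.

Given is an Excel XML file with a structure like this: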

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
...
 </DocumentProperties>
 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
...
 </OfficeDocumentSettings>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
...
 </ExcelWorkbook>
 <Styles>
...
 </Styles>
 <Worksheet ss:Name="Report">
  <Table ss:ExpandedColumnCount="41" ss:ExpandedRowCount="4082" x:FullColumns="1"
   x:FullRows="1" ss:DefaultColumnWidth="60" ss:DefaultRowHeight="15">
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_1</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_2</Data></Cell>
    ...
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_40_Active</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_1</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_2</Data></Cell>
    ...
    <Cell ss:StyleID="s62"><Data ss:Type="String">Cell_40_Active</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
...
  </WorksheetOptions>
 </Worksheet>
</Workbook>

The goal is to select only those rows whose 40th cell contains "Cell_40_Active". Something like: Cell[40].Data.InnerText = "Cell_40_Active" ...

        XmlDocument doc = new XmlDocument();
        doc.Load(file);
        XmlElement root = doc.DocumentElement;
        // returns all Row elements (works)
        XmlNodeList allRows = root.GetElementsByTagName("Row");
        // returns no elements (count is 0)
        XmlNodeList matches = root.SelectNodes("/Worksheet/Row/Cell[40]='Cell_40_Active'");

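How can this be done? I could not find anything similar. Any hints? Thank you very much.

2 answers:

Answer 0 (score: 1)

All elements are in the namespace urn:schemas-microsoft-com:office:spreadsheet, so you have to account for that by registering a prefix for it: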

var nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("x", "urn:schemas-microsoft-com:office:spreadsheet");
XmlNodeList nodes = root.SelectNodes("<xpath expr using x prefix>", nsmgr);

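Based on your description, the XPath expression should probably be the following (using the prefix x defined above). Note that the original attempt also skipped the Workbook and Table levels of the path: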

/x:Workbook/x:Worksheet/x:Table/x:Row[x:Cell[40]/x:Data='Cell_40_Active']
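
Putting the two pieces together, a minimal sketch could look like the following. The class name RowSelector, the helper SelectRowsByCell, and the small two-column stand-in document are made up for illustration and are not part of the question; the position 2 stands in for the real position 40.

```csharp
using System;
using System.Xml;

public static class RowSelector
{
    // Default namespace of Workbook, Worksheet, Table, Row, Cell and Data.
    private const string SpreadsheetNs = "urn:schemas-microsoft-com:office:spreadsheet";

    // Hypothetical helper: returns the Row elements whose cell at the given
    // 1-based position has a Data child with exactly the expected text.
    // Assumption: 'expected' contains no apostrophe, because it is placed
    // inside a single-quoted XPath string literal.
    public static XmlNodeList SelectRowsByCell(XmlDocument doc, int position, string expected)
    {
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("x", SpreadsheetNs);

        string xpath = "/x:Workbook/x:Worksheet/x:Table/x:Row[x:Cell["
                       + position + "]/x:Data='" + expected + "']";
        return doc.SelectNodes(xpath, nsmgr);
    }

    public static void Main()
    {
        // Reduced stand-in for the Excel file: two rows with two cells each.
        const string xml = @"<Workbook xmlns=""urn:schemas-microsoft-com:office:spreadsheet"">
  <Worksheet>
    <Table>
      <Row><Cell><Data>A</Data></Cell><Cell><Data>Active</Data></Cell></Row>
      <Row><Cell><Data>B</Data></Cell><Cell><Data>Inactive</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>";

        var doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNodeList rows = SelectRowsByCell(doc, 2, "Active");
        Console.WriteLine(rows.Count); // prints 1
    }
}
```

The prefix x used in the XPath expression only needs to match the one registered on the XmlNamespaceManager; it does not have to match any prefix declared in the file itself.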

Answer 1 (score: 0)

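Perhaps you could use the indexer of XmlNodeList, for example within a LINQ query (this needs using System.Linq, and the range variable must be typed explicitly because XmlNodeList is a non-generic collection):

var result = from XmlNode row in root.GetElementsByTagName("Row")
             where row.ChildNodes.Count >= 40
                   && row.ChildNodes[39].InnerText == "Cell_40_Active"
             // ChildNodes is zero-based, so index 39 is the 40th child.
             // InnerText is used to "skip over" the Data element;
             // this won't work if the cell has other child nodes with inner text.
             select row;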
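
Since the question also mentions LINQ, the same selection can be sketched with LINQ to XML (XDocument) instead of XmlDocument. This is a swapped-in alternative, not what either answer proposes; the class name LinqRowSelector, the helper SelectRowsByCell, and the reduced stand-in document are hypothetical, and position 2 stands in for the real position 40.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LinqRowSelector
{
    // Default namespace of the Row, Cell and Data elements.
    private static readonly XNamespace Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Hypothetical helper: returns the rows whose cell at the given
    // 1-based position has a Data child with exactly the expected text.
    public static List<XElement> SelectRowsByCell(XDocument doc, int position, string expected)
    {
        return doc.Descendants(Ss + "Row")
                  .Where(row =>
                  {
                      // ElementAtOrDefault is zero-based and yields null
                      // when the row has fewer cells than requested.
                      XElement cell = row.Elements(Ss + "Cell").ElementAtOrDefault(position - 1);
                      // The string cast returns null if there is no Data element.
                      return cell != null && (string)cell.Element(Ss + "Data") == expected;
                  })
                  .ToList();
    }

    public static void Main()
    {
        // Reduced stand-in for the Excel file: two rows with two cells each.
        const string xml = @"<Workbook xmlns=""urn:schemas-microsoft-com:office:spreadsheet"">
  <Worksheet>
    <Table>
      <Row><Cell><Data>A</Data></Cell><Cell><Data>Active</Data></Cell></Row>
      <Row><Cell><Data>B</Data></Cell><Cell><Data>Inactive</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>";

        XDocument doc = XDocument.Parse(xml);
        List<XElement> rows = SelectRowsByCell(doc, 2, "Active");
        Console.WriteLine(rows.Count); // prints 1
    }
}
```

Unlike the InnerText approach, this compares only the text of the Data element, and it counts only Cell elements when determining the position, so comments or other child nodes in a row do not shift the index.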