Retrieve data from the sibling nodes of a matched node

Date: 2012-12-02 22:42:51

Tags: php xml xpath simplexml siblings

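I'm using SimpleXML to iterate over an XML document. I have an array of ids ($ids), and I'm checking whether each one has a match in the XML (Worksheet/Table/Row/Cell/Data). When an id matches, I want to get the data from the following two siblings, but I can't figure out how.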

The PHP:

// $ids <---- array('8', '53', '38')

foreach ($thePositions->Worksheet->Table->Row as $row) {

    if($row->Cell->Data == true) {

        for ($i = 0; $i < count($ids); $i++) {
            foreach($row->Cell->Data as $data) {

                if ($data == $ids[$i]) {
                    echo 'match!';

                    /* 
                       Tried $siblings = $data->xpath('preceding-sibling::* | following-sibling::*');
                       but doesn't seem to work in this case.
                    */
                }
            }
        }
    }
}

xml:

<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <LastAuthor>Herpa Derp </LastAuthor>
  <Created>2012-09-25T13:44:01Z</Created>
  <LastSaved>2012-09-25T13:48:24Z</LastSaved>
  <Version>14.0</Version>
 </DocumentProperties>
 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
  <AllowPNG/>
 </OfficeDocumentSettings>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>14060</WindowHeight>
  <WindowWidth>25040</WindowWidth>
  <WindowTopX>25540</WindowTopX>
  <WindowTopY>4100</WindowTopY>
  <Date1904/>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font ss:FontName="Calibri" x:Family="Swiss" ss:Size="12" ss:Color="#000000"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
  <Style ss:ID="s62">
   <Font ss:FontName="Courier" ss:Color="#000000"/>
  </Style>
 </Styles>
 <Worksheet ss:Name="Workbook1.csv">
  <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="79" x:FullColumns="1"
   x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="15">
   <Column ss:Index="2" ss:AutoFitWidth="0" ss:Width="43"/>
   <Column ss:AutoFitWidth="0" ss:Width="113"/>
   <Column ss:Index="5" ss:AutoFitWidth="0" ss:Width="220"/>
   <Row ss:Index="6">
    <Cell ss:Index="3" ss:StyleID="s62"/>
   </Row>
   <Row>
    <Cell ss:Index="3" ss:StyleID="s62"/>
   </Row>
   <Row>
    <Cell ss:Index="3" ss:StyleID="s62"/>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="String">id</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="String">latitude</Data></Cell>
    <Cell><Data ss:Type="String">longitude</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">8</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">57.4999</Data></Cell>    <!-- to be saved to $latitude -->
    <Cell><Data ss:Type="Number">15.8280</Data></Cell>    <!-- to be saved to $longitude -->
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">38</Data></Cell>
    <Cell><Data ss:Type="Number">56.5659</Data></Cell>
    <Cell><Data ss:Type="Number">16.1380</Data></Cell>
   </Row>

4 answers:

Answer 0 (score: 1)

With this kind of XML the cells are always in the same order, so it can be done as follows:

$ids = array('8', '53', '38');
foreach ($xml->Worksheet->Table->Row as $row) {
    $children = $row->children();
    if (count($children) == 3 && in_array(((string) $children[0]->Data), $ids)) {
        echo 'lat: ' . $children[1]->Data . ' lng: ' . $children[2]->Data . "\n";
    }
}

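Answer 1 (score: 1)

The reason asking for siblings doesn't work is that the <Data> elements aren't siblings; they're more like cousins: children of adjacent <Cell> elements.

For the same reason, you shouldn't use foreach($row->Cell->Data as $data), because it is equivalent to foreach($row->Cell[0]->Data as $data), i.e. it looks at all the <Data> children of the first <Cell> node. Since there is only ever one <Data> element in a <Cell>, you might as well just write $data = $row->Cell[0]->Data, which is fine in this case because the values you're looking for are at the start of each row.

What you actually need to do is loop over the <Cell>s: foreach($row->Cell as $cell) { $data = $cell->Data; /* ... */ }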
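To make the difference concrete, here is a minimal sketch that counts what each loop visits. The $xmlString variable and the trimmed one-row document are assumptions made for illustration; the row itself is copied from the question.

```php
<?php
// Trimmed, illustrative copy of one data row from the question's spreadsheet.
$xmlString = <<<'XML'
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Workbook1.csv">
  <Table>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">8</Data></Cell>
    <Cell><Data ss:Type="Number">57.4999</Data></Cell>
    <Cell><Data ss:Type="Number">15.8280</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>
XML;

$xml = simplexml_load_string($xmlString);
$row = $xml->Worksheet->Table->Row[0];

// Looping over $row->Cell->Data only sees the <Data> of the first <Cell>.
$viaData = array();
foreach ($row->Cell->Data as $data) {
    $viaData[] = (string) $data;
}

// Looping over $row->Cell visits every <Cell> in the row.
$viaCell = array();
foreach ($row->Cell as $cell) {
    $viaCell[] = (string) $cell->Data;
}

print_r($viaData); // a single entry: the id
print_r($viaCell); // three entries: id, latitude, longitude
```

The first loop never reaches the latitude or longitude, which is why the match in the question had nothing to step sideways to.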

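You then have a few options for finding the adjacent cells, including XPath. A more "PHP-ish" way is to use array indexes, since siblings are indexed numerically in SimpleXML array access. Note that foreach over a SimpleXML element yields the element name as its key rather than a number, so the position is tracked with a counter here:

$cell_index = 0;
foreach($row->Cell as $cell)
{
    $data = $cell->Data;
    if ($data == $ids[$i])
    {
        // Tip: always cast SimpleXML objects to string when you're done with their magic XMLiness
        $latitudes[$i] = (string)$row->Cell[ $cell_index + 1 ]->Data;
        $longitudes[$i] = (string)$row->Cell[ $cell_index + 2 ]->Data;
    }
    // Advance the counter so it always holds the position of the current <Cell>
    $cell_index++;
}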
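The XPath option mentioned above can be sketched as follows. Because the loop already holds the <Cell>, the following-sibling axis can be applied to it directly. The $xmlString sample and the $positions array are assumptions for illustration, and the * wildcard is used so that no namespace prefix has to be registered.

```php
<?php
// Trimmed, illustrative copy of two data rows from the question's spreadsheet.
$xmlString = <<<'XML'
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Workbook1.csv">
  <Table>
   <Row>
    <Cell ss:Index="3" ss:StyleID="s62"/>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">8</Data></Cell>
    <Cell><Data ss:Type="Number">57.4999</Data></Cell>
    <Cell><Data ss:Type="Number">15.8280</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">38</Data></Cell>
    <Cell><Data ss:Type="Number">56.5659</Data></Cell>
    <Cell><Data ss:Type="Number">16.1380</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>
XML;

$xml = simplexml_load_string($xmlString);
$ids = array('8', '53', '38');
$positions = array(); // hypothetical result holder: id => array(lat, lng)

foreach ($xml->Worksheet->Table->Row as $row) {
    foreach ($row->Cell as $cell) {
        $id = (string) $cell->Data;
        if (!in_array($id, $ids, true)) {
            continue; // empty cell, or a value that is not a wanted id
        }

        // From the matched <Cell>, take the next two sibling elements and
        // step down to their children (the <Data> elements).
        $next = $cell->xpath('following-sibling::*[position() <= 2]/*');

        if (count($next) === 2) {
            $positions[$id] = array((string) $next[0], (string) $next[1]);
        }
    }
}

print_r($positions);
```

This scans every cell, so a latitude or longitude that happens to equal an id would also count as a match; restricting the check to the first cell of each row avoids that.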

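Alternatively, you can rely on your IDs always being in the first column, with the lat and long in the next two (this is a spreadsheet, after all!), and avoid the inner loop altogether: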

if ( $row->Cell[0]->Data == $ids[$i] )
{
    $latitudes[$i] = (string)$row->Cell[1]->Data;
    $longitudes[$i] = (string)$row->Cell[2]->Data;
}

Answer 2 (score: 0)

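You can do it entirely in XPath, without any loops, e.g.:

//Cell[Data[. = '8' or . = '53' or . = '38']]/following-sibling::*[position() <= 2]

This finds every cell whose Data element holds one of the ids, then takes the next two sibling cells.

//Row/Cell[1][Data[. = '8' or . = '53' or . = '38']]/following-sibling::*[position() <= 2]

Use this if the id is certain to always be in the first cell. (It also prevents false matches when a latitude or longitude happens to equal an id.)

//Cell[@ss:Index = "2"][Data[. = '8' or . = '53' or . = '38']]/following-sibling::*[position() <= 2]

Use this if the id is in the cell with index 2.

In all cases you need to set up the namespaces correctly: the elements are in the default namespace urn:schemas-microsoft-com:office:spreadsheet, so in XPath 1.0 each element name needs a prefix bound to that namespace (for example ss:Row, ss:Cell, ss:Data).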
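As a rough sketch of the namespace setup, the following runs a query of this kind from SimpleXML and selects the two cells that follow a matching id cell. The $xmlString sample, the choice of ss as the XPath prefix, and the pairing of results with array_chunk are assumptions for illustration; the pairing only holds if every matched row has exactly two cells after the id.

```php
<?php
// Trimmed, illustrative copy of two data rows from the question's spreadsheet.
$xmlString = <<<'XML'
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Workbook1.csv">
  <Table>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">8</Data></Cell>
    <Cell><Data ss:Type="Number">57.4999</Data></Cell>
    <Cell><Data ss:Type="Number">15.8280</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">38</Data></Cell>
    <Cell><Data ss:Type="Number">56.5659</Data></Cell>
    <Cell><Data ss:Type="Number">16.1380</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>
XML;

$xml = simplexml_load_string($xmlString);

// Bind a prefix to the spreadsheet namespace. Unprefixed names in XPath 1.0
// never match elements that live in a default namespace.
$xml->registerXPathNamespace('ss', 'urn:schemas-microsoft-com:office:spreadsheet');

// First cell of each row holding a wanted id, then its next two sibling
// cells, then the <Data> element inside each of them.
$query = "//ss:Row/ss:Cell[1][ss:Data[. = '8' or . = '53' or . = '38']]"
       . "/following-sibling::ss:Cell[position() <= 2]/ss:Data";

$values = array();
foreach ($xml->xpath($query) as $data) {
    $values[] = (string) $data;
}

// Results come back in document order: lat, lng, lat, lng, ...
$pairs = array_chunk($values, 2);
print_r($pairs);
```

The single query returns only the latitude and longitude values, not the ids they belong to. If the id is needed alongside each pair, one option is to select the matching id cells first and run a relative following-sibling query from each.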

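Answer 3 (score: 0)

If you have a lot of IDs to match, another approach is to build a "hash" of all the rows keyed by their ID, and then look IDs up in that hash instead of looping to search for each match.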

// Initialise an empty array to use as the hash
$rows_by_id = array();

// Loop over all the rows in the spreadsheet
foreach ($thePositions->Worksheet->Table->Row as $row) {
    // Skip rows with fewer than 3 cells
    if ( count($row->Cell) < 3 ) {
        continue;
    }

    // Take the ID from the first cell on the row
    $id = (string)$row->Cell[0]->Data;

    // Add this row to the hash, identified by its ID
    $rows_by_id[$id] = array(
        'latitude'  => (string)$row->Cell[1]->Data,
        'longitude' => (string)$row->Cell[2]->Data
    );

    // Or if your IDs are not unique, and you want all matches:
    // $rows_by_id[$id][] = array( ... )
}

foreach ( $ids  as $required_id ) {
    // Retrieve the results from the hash
    if ( isset($rows_by_id[$required_id]) ) { 
        $matched_row = $rows_by_id[$required_id];

        echo "ID $required_id has latitude {$matched_row['latitude']} and longitude {$matched_row['longitude']}.\n";
    }
    else {
        echo "ID $required_id was not matched. :(\n";
    }

    // If you have non-unique matches, you'll need something like this:
    // $all_matched_rows = $rows_by_id[$required_id]; ... foreach ( $all_matched_rows as $matched_row )
}
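The non-unique variant hinted at in the comments above could look roughly like this. It is only a sketch: the $xmlString sample is an assumption, and the second row for id 8 (57.5000 / 15.8300) is made-up data added purely to show a duplicate id.

```php
<?php
// Illustrative sample. The second row for id 8 is invented to demonstrate
// duplicate ids; the other rows are copied from the question.
$xmlString = <<<'XML'
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Workbook1.csv">
  <Table>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">8</Data></Cell>
    <Cell><Data ss:Type="Number">57.4999</Data></Cell>
    <Cell><Data ss:Type="Number">15.8280</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">8</Data></Cell>
    <Cell><Data ss:Type="Number">57.5000</Data></Cell>
    <Cell><Data ss:Type="Number">15.8300</Data></Cell>
   </Row>
   <Row>
    <Cell ss:Index="2"><Data ss:Type="Number">38</Data></Cell>
    <Cell><Data ss:Type="Number">56.5659</Data></Cell>
    <Cell><Data ss:Type="Number">16.1380</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>
XML;

$xml = simplexml_load_string($xmlString);
$ids = array('8', '53', '38');

// Each id maps to a LIST of rows, so duplicates are kept rather than overwritten.
$rows_by_id = array();
foreach ($xml->Worksheet->Table->Row as $row) {
    if (count($row->Cell) < 3) {
        continue;
    }
    $id = (string) $row->Cell[0]->Data;
    $rows_by_id[$id][] = array(
        'latitude'  => (string) $row->Cell[1]->Data,
        'longitude' => (string) $row->Cell[2]->Data,
    );
}

$lines = array();
foreach ($ids as $required_id) {
    if (!isset($rows_by_id[$required_id])) {
        $lines[] = "ID $required_id was not matched.";
        continue;
    }
    // Report every row stored under this id.
    foreach ($rows_by_id[$required_id] as $matched_row) {
        $lines[] = "ID $required_id: {$matched_row['latitude']}, {$matched_row['longitude']}";
    }
}

echo implode("\n", $lines), "\n";
```

The spreadsheet is walked once regardless of how many ids are requested, which is the point of building the hash first.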