XPath(PHP Dom) - 如何查找带属性的span并检索以前的兄弟

时间:2016-04-07 09:56:02

标签: php html dom xpath

给出以下HTML文档的摘录:

<p>
<span m='1900'>INTERVIEWER: Today</span> 
<span m='2180'>is</span> 
<span m='2330'>October</span> 
<span m='2940'>31,</span> 
<span m='3940'>2008.</span> 
<span m='5750'>I'm</span> 
<span m='6010'>Interviewing</span> 
<span m='7700'>for</span> 
<span m='7920'>the</span> 
<span m='8050'>organisation's</span> 
<span m='9360'>sesquicentennial</span> 
<span m='10410'>Oral</span> 
<span m='10690'>History</span> 
<span m='11170'>project,</span> 
<span m='12380'>talking</span> 
<span m='12760'>about</span> 
<span m='12950'>how</span> 
<span m='13470'>things</span>
</p>

如何使用<span>属性的特定值以及之前的5个兄弟姐妹提取m

到目前为止,我有以下代码,正确识别范围:

$xpath_query = "//span[@m='" . $timestamp . "']";
$span = $xpath->query($xpath_query);

如何检索以前的兄弟姐妹?

修改

我的最终目标是创建一个HTML字符串,其中包含匹配以及之前的5个兄弟姐妹和接下来的5个兄弟姐妹。这是我到目前为止的代码:

      $output_doc = new DOMDocument;

      $xpath_query = "//span[@m='" . $timestamp . "']/preceding-sibling::span[position() < 5]";
      $previous_spans = $xpath->query($xpath_query);

      foreach($previous_spans as $previous_span) {
        $output_doc->appendChild($previous_span);
      }

      $xpath_query = "//span[@m='" . $timestamp . "']";
      $span = $xpath->query($xpath_query);
      $output_doc->appendChild($span);

      $xpath_query = "//span[@m='" . $timestamp . "']/following-sibling::span[position() < 5]";
      $next_spans = $xpath->query($xpath_query);

      foreach($next_spans as $next_span) {
        $output_doc->appendChild($next_span);
      }

编辑2: 以下是我最终使用的代码:

function get_snippets($transcript, $timestamps) {
    $doc = new DOMDocument;
    $doc->loadHTML($transcript);

    $output = array();

    $xpath = new DOMXPath($doc);
    foreach($timestamps as $timestamp) {

      $xpath_query = "//span[@m='" . $timestamp . "']/preceding-sibling::span[position() < 10]";
      $previous_spans = $xpath->query($xpath_query);

      foreach($previous_spans as $previous_span) {
        $output[$timestamp] = $output[$timestamp] . ih_video_outer_html($previous_span);
      }

      $xpath_query = "//span[@m='" . $timestamp . "']";
      $span = $xpath->query($xpath_query);

      foreach($span as $match) {
        $output[$timestamp] = $output[$timestamp] . ih_video_outer_html($match);
      }

      $xpath_query = "//span[@m='" . $timestamp . "']/following-sibling::span[position() < 10]";
      $next_spans = $xpath->query($xpath_query);

      foreach($next_spans as $next_span) {
        $output[$timestamp] = $output[$timestamp] . ih_video_outer_html($next_span);
      }
    }
}

function ih_video_outer_html($e) {
  $doc = new DOMDocument();
  $doc->appendChild($doc->importNode($e, true));
  return $doc->saveHTML();
}

1 个答案:

答案 0 :(得分:1)

//span[@m=12950] | //span[@m=12950]/preceding-sibling::span[position() < 6]

// span [@ m = 12950] - 带上你的项目

// span [@ m = 12950] / preceding-sibling :: span - 添加先前的跨度

[position()&lt; 6] - 不超过5项