Question

I am trying to find all elements of a given tag in an HTML string and get the start and end position of each one.

Here is my sample HTML:

some content <iframe></iframe> <iframe></iframe> another content

Here is the code I have so far.

$dom = HtmlDomParser::str_get_html($this->content);

$iframes = array();
foreach( $dom->find( 'iframe' ) as $iframe) {
    $iframes[] = $iframe;
}

return array(
    'hasIFrame' =>  count( $iframes ) > 0
);

Getting the number of elements is easy, but I am not sure whether HtmlDomParser can give me the start and end positions.

What I want is:

array(
 'hasIFrame' => true,
 'numberOfElements' => 2,
 array(
  0 => array(
   'start' => $firstStartingElement,
   'end'   => $firstEndingElement
  ),
  1 => array(
   'start' => $secondStartingElement,
   'end'   => $secondEndingElement
  )
 )
)

Answer 1

If you look at the official documentation (http://simplehtmldom.sourceforge.net/), you can easily find out how many elements of a given type are in the DOM:

// Find all images 
foreach($html->find('img') as $element) {
       echo $element->src . '<br>';
}

All you need to do is retrieve $html->find('iframe') and check its size to determine whether there is at least one match.
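
For readers without Simple HTML DOM installed, the same count check can be sketched with PHP's built-in DOMDocument. This swaps in a different parser from the one used in this answer, and the function name countIframes is a hypothetical helper chosen for illustration:

```php
<?php
// Hypothetical helper: counts <iframe> elements with the built-in DOM
// extension instead of Simple HTML DOM.
function countIframes(string $html): array
{
    $doc = new DOMDocument();

    // libxml emits warnings for HTML fragments; collect them internally
    // rather than printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $count = $doc->getElementsByTagName('iframe')->length;

    return array(
        'hasIFrame'        => $count > 0,
        'numberOfElements' => $count,
    );
}

$summary = countIframes('some content <iframe></iframe> <iframe></iframe> another content');
var_dump($summary['hasIFrame'], $summary['numberOfElements']);
```

Like find(), this only tells you how many elements exist. It does not report where they sit in the original string.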

Answer 2

You can do it like this:

$html = "some content <iframe></iframe> <iframe></iframe> another content";
preg_match_all('/<iframe>/', $html, $iframesStartPositions, PREG_OFFSET_CAPTURE);
preg_match_all('/<\/iframe>/', $html, $iframesEndPositions, PREG_OFFSET_CAPTURE);


$iframesPositions = array();
// $dom is the parsed document from the question's code.
foreach( $dom->find( 'iframe' ) as $key => $iframe) {
    $iframesPositions[] = array(
      'start' => $iframesStartPositions[0][$key][1],
      'end'   => $iframesEndPositions[0][$key][1] + 9 // 9 is the length of the closing tag </iframe>
    );
}

return array(
    'hasIFrame'        =>  count($iframesPositions) > 0,
    'numberOfElements' => count($iframesPositions),
    'positions'        =>  $iframesPositions
);
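
A variation on the same idea, offered as a minimal sketch rather than a definitive implementation: one pattern that spans the whole element returns each start offset directly, and the end follows from the match length, so two match lists do not have to be paired by index. The function name findIframePositions is hypothetical, and the sketch assumes iframes are not nested and that the markup is regular enough for a regex.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): locate every
// <iframe>...</iframe> element and report its byte offsets.
function findIframePositions(string $html): array
{
    // One pattern covers the opening tag (with optional attributes), the
    // content, and the closing tag. The lazy .*? stops at the first </iframe>.
    $pattern = '/<iframe\b[^>]*>.*?<\/iframe>/is';
    preg_match_all($pattern, $html, $matches, PREG_OFFSET_CAPTURE);

    $positions = array();
    foreach ($matches[0] as $match) {
        list($text, $offset) = $match;
        $positions[] = array(
            'start' => $offset,                 // offset of "<" in the opening tag
            'end'   => $offset + strlen($text), // offset just past ">" of the closing tag
        );
    }

    return array(
        'hasIFrame'        => count($positions) > 0,
        'numberOfElements' => count($positions),
        'positions'        => $positions,
    );
}

$result = findIframePositions('some content <iframe></iframe> <iframe></iframe> another content');
print_r($result['positions']);
```

For the sample string this reports offsets 13 to 30 for the first iframe and 31 to 48 for the second. PREG_OFFSET_CAPTURE returns byte offsets, so content with multibyte characters before an iframe will give byte positions rather than character positions.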

How can I find all elements in HTML with PHP and get all of their positions?
