I have this HTML in $string:
$string = '<p>random</p>
<a href="">Test 1</a> (target1)
<br>
<a href="">Test 2</a> (target1)
<br>
<a href="">Test 3</a> (skip)
// etc
';
I have a few phrases in $array:
$array = array(
'(target1)',
'(target2)'
);
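How can I search $string for all of the terms in $array and get the content of the <a> tag that precedes each one?
I want to end up with the following result: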
$results = array(
array(
'text' => 'Test 1',
'needle' => 'target1'
),
array(
'text' => 'Test 2',
'needle' => 'target1'
)
);
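Answer 0 (score: 1)
I'll give you an answer in JavaScript, but PHP can do the same thing.
You can search the string for one array entry at a time and stop once you reach the end of the array; an entry with no match simply yields no result.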
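target1Match = s.match(/<.+?>(.+?)<\/.+?> *\(target1\)/);
// target1Match is now ['<a href="">Test 1</a> (target1)', 'Test 1'], or null if nothing matched
target1Match = target1Match ? target1Match[1] : null;
target2Match = s.match(/<.+?>(.+?)<\/.+?> *\(target2\)/);
// For HTML containing '<a href="">Test 2</a> (target2)', target2Match would be
// ['<a href="">Test 2</a> (target2)', 'Test 2']; for the sample HTML it is null
target2Match = target2Match ? target2Match[1] : null;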
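In practice you would build the regular expression from the array entries rather than hard-coding "target1" and "target2".
To match multiple targets and a specific tag: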
s.match(/<a.+?>(.+?)<\/a> *\((target1|target2)\)/);
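Building that pattern from the array could look roughly like the sketch below. The helper names escapeRegExp and findPrecedingLinks are made up for illustration, and the sample string is the one from the question. It assumes each needle should be matched literally, so regex metacharacters such as the parentheses are escaped first, and it uses the g flag with matchAll to collect every match rather than only the first.

```javascript
// Hypothetical helper: escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: collect the text of every <a> tag that is
// directly followed by one of the needles.
function findPrecedingLinks(html, needles) {
  if (needles.length === 0) {
    return []; // an empty alternation would match after every link
  }
  const alternation = needles.map(escapeRegExp).join('|');
  const pattern = new RegExp('<a\\b[^>]*>(.+?)<\\/a>\\s*(' + alternation + ')', 'g');
  const results = [];
  for (const match of html.matchAll(pattern)) {
    results.push({ text: match[1], needle: match[2] });
  }
  return results;
}

// Sample HTML from the question.
const s = '<p>random</p>\n' +
  '<a href="">Test 1</a> (target1)\n' +
  '<br>\n' +
  '<a href="">Test 2</a> (target1)\n' +
  '<br>\n' +
  '<a href="">Test 3</a> (skip)\n';

console.log(findPrecedingLinks(s, ['(target1)', '(target2)']));
```

For the sample string this yields two entries, "Test 1" and "Test 2", each with the needle "(target1)". Because "." does not match a newline here, a link followed by "(skip)" cannot swallow a later line and produce a false match.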
Answer 1 (score: 0)
// Assuming your HTML as $str, your terms as $terms
$results = [];
foreach ($terms as $t) {
// Get content of <a> tag preceding the term
preg_match_all('/<a ?.*>(.*)<\/a>\s+' . preg_quote($t) . '/', $str, $matches);
// Then insert into your result array
foreach ($matches[1] as $m) {
$results[] = [
'text' => $m,
'needle' => $t
];
}
}
Output:
// echo '<pre>' . print_r($results, true) . '</pre>';
Array
(
[0] => Array
(
[text] => Test 1
[needle] => (target1)
)
[1] => Array
(
[text] => Test 2
[needle] => (target1)
)
)
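See also: preg_quote()
Answer 2 (score: 0)
I am in JayBlanchard's camp. Here is a solution that properly uses DOMDocument and XPath with a dynamically generated query to target <a> tags that are immediately followed by text containing one of the qualifying needles.
For the sample needles, this is the generated query: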
//a[following-sibling::text()[1][contains(.,'(target1)') or contains(.,'(target2)')]]
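To make the shape of that query easier to see, here is a minimal sketch of the string assembly, written in JavaScript only for illustration. The function name buildQuery is hypothetical, and the sketch assumes no needle contains a single quote, because XPath 1.0 string literals have no escape syntax.

```javascript
// Hypothetical helper: build the XPath query for a list of needles.
// Assumes no needle contains a single quote.
function buildQuery(needles) {
  // One contains() predicate per needle, joined with "or".
  const contains = needles
    .map((needle) => "contains(.,'" + needle + "')")
    .join(' or ');
  // Select <a> elements whose first following text sibling satisfies any predicate.
  return '//a[following-sibling::text()[1][' + contains + ']]';
}

console.log(buildQuery(['(target1)', '(target2)']));
// prints: //a[following-sibling::text()[1][contains(.,'(target1)') or contains(.,'(target2)')]]
```

Joining the predicates with "or" means a single query covers every needle, so the document is traversed once rather than once per needle.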
Code:
$html = '<p>random</p>
<a href="">Test 1</a> (skip)
<br>
<a href="">Test 2</a> (target1)
<br>
<a href="">Test 3</a> (target1)
<br>
<a href="">Test 4</a> (skip)
<br>
<a href="">Test 5</a> (target2)
<br>
<a href="">Test 6</a> (skip)
';
$needles = [
'(target1)',
'(target2)'
];
$contains = array_reduce($needles, function($carry, $needle) {
return $carry .= ($carry !== null ? ' or ' : '') . "contains(.,'$needle')";
});
$matches = [];
$dom=new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//a[following-sibling::text()[1][$contains]]") as $node) {
$matches[] = ["text" => $node->nodeValue, "needle" => trim($node->nextSibling->nodeValue)];
}
var_export($matches);
Output:
array (
0 =>
array (
'text' => 'Test 2',
'needle' => '(target1)',
),
1 =>
array (
'text' => 'Test 3',
'needle' => '(target1)',
),
2 =>
array (
'text' => 'Test 5',
'needle' => '(target2)',
),
)