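I'm trying to create a pattern that matches all similar words/phrases in a string.
For example, I need to match: "this", "this is", "this is it", "that", "that was", "that was not".
It only matches the first occurrence of "this", but it should match all occurrences.
I've even tried anchors and word boundaries, but nothing seems to work.
I have tried (simplified):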
$content = "this is it! that was not!";
preg_match_all('/(this|this is|this is it|that|that was|that was not)/i', $content, $results);
It should output:
Answer 0 (score: 1)
How about:
$content = "this is it";
preg_match_all('/(?=(this))(?=(this is))(?=(this is it))/i', $content, $results);
print_r($results);
Edit according to the comments:
$content = "this is it! that was not!";
preg_match_all('/(?=(this))(?=(this is))(?=(this is it))|(?=(that))(?=(that was))(?=(that was not))/i', $content, $results);
print_r($results);
Output:
Array
(
[0] => Array
(
[0] =>
[1] =>
)
[1] => Array
(
[0] => this
[1] =>
)
[2] => Array
(
[0] => this is
[1] =>
)
[3] => Array
(
[0] => this is it
[1] =>
)
[4] => Array
(
[0] =>
[1] => that
)
[5] => Array
(
[0] =>
[1] => that was
)
[6] => Array
(
[0] =>
[1] => that was not
)
)
More generally:
$content = "this is it! that was not!";
preg_match_all('/\b(?=(\w+))(?=(\w+ \w+))(?=(\w+ \w+ \w+))\b/i', $content, $results);
print_r($results);
Output:
Array
(
[0] => Array
(
[0] =>
[1] =>
)
[1] => Array
(
[0] => this
[1] => that
)
[2] => Array
(
[0] => this is
[1] => that was
)
[3] => Array
(
[0] => this is it
[1] => that was not
)
)
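If a flat list of phrases is more convenient than one array per capture group, the same lookahead pattern can be read back in match order. This is only a sketch: PREG_SET_ORDER is a standard preg_match_all flag, but the $phrases variable is a name I made up for illustration.

```php
<?php
$content = "this is it! that was not!";

// Each lookahead captures without consuming, so all three groups
// start at the same word boundary and can overlap.
$pattern = '/\b(?=(\w+))(?=(\w+ \w+))(?=(\w+ \w+ \w+))\b/i';

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($pattern, $content, $results, PREG_SET_ORDER);

$phrases = array();
foreach ($results as $set) {
    // $set[0] is the empty overall match; the phrases are in groups 1 to 3.
    foreach (array_slice($set, 1) as $phrase) {
        $phrases[] = $phrase;
    }
}

print_r($phrases);
```

Note that the pattern only matches at positions where all three lookaheads succeed, so a word that is not followed by two more words is skipped.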
Answer 1 (score: 1)
The problem is that the shortest option comes first in the alternation group:
/(this|this is|this is it)/i
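PHP tries the alternatives in (this|this is|this is it) from left to right.
As soon as one alternative matches the test string, it leaves the group.
This will work, because PHP will try the longest string first: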
/(this is it|this is|this)/i
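A quick way to see the difference is to run both orderings against the string from the question. This is a minimal sketch; the variable names $shortFirst and $longFirst are mine, not from the question.

```php
<?php
$content = "this is it! that was not!";

// Shortest alternative first: the engine settles for "this" and "that".
preg_match_all('/this|this is|this is it|that|that was|that was not/i', $content, $shortFirst);

// Longest alternative first: the full phrase wins at each position.
preg_match_all('/this is it|this is|this|that was not|that was|that/i', $content, $longFirst);

print_r($shortFirst[0]); // this, that
print_r($longFirst[0]);  // this is it, that was not
```

Either way, each position yields only one match, so the shorter overlapping phrases are not reported separately.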
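Answer 2 (score: 1)
Given that you're only capturing the terms you're searching for, you may be better off just using a foreach
loop together with substr_count
to see how many times each string occurs.
For example: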
$haystack = "this is it! that was not! this is not a test!";
$needles = array(
"this",
"this is",
"this is it",
"that",
"that was",
"that was not");
foreach ($needles as $needle) {
// substr_count is case sensitive, so make subject and search lowercase
$hits = substr_count(strtolower($haystack), strtolower($needle));
echo "Search '$needle' occurs $hits time(s)" . PHP_EOL;
}
The above will output:
Search 'this' occurs 2 time(s)
Search 'this is' occurs 2 time(s)
Search 'this is it' occurs 1 time(s)
Search 'that' occurs 1 time(s)
Search 'that was' occurs 1 time(s)
Search 'that was not' occurs 1 time(s)
If substr_count
doesn't give you the flexibility you need, you can always replace it with preg_match_all
and use your individual $needle
values as the search terms.
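A rough sketch of that preg_match_all variant could look like the following. The word boundaries and the $counts array are my own additions for illustration; drop the \b if you also want hits inside longer words.

```php
<?php
$haystack = "this is it! that was not! this is not a test!";
$needles = array("this", "this is", "this is it", "that", "that was", "that was not");

$counts = array();
foreach ($needles as $needle) {
    // preg_quote escapes any regex metacharacters in the needle;
    // \b keeps "this" from matching inside a longer word.
    $pattern = '/\b' . preg_quote($needle, '/') . '\b/i';
    // preg_match_all returns the number of full pattern matches.
    $counts[$needle] = preg_match_all($pattern, $haystack);
}

print_r($counts);
```

The /i modifier makes the search case-insensitive, so the strtolower calls are no longer needed.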
Answer 3 (score: 0)
You can also use the following regular expression:
/(this(?:\sis(?:\sit)?)?)/i
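To illustrate how the nested optional groups behave, here is a small sketch on a test string of my own (not from the question). Each match extends as far as the optional parts allow.

```php
<?php
// Hypothetical test string chosen to hit all three phrase lengths.
$content = "this is it! this is not a test! this?";

// "this", optionally followed by " is", optionally followed by " it".
preg_match_all('/(this(?:\sis(?:\sit)?)?)/i', $content, $results);

print_r($results[1]); // this is it, this is, this
```

As written, the pattern only covers the "this" family; the "that" phrases would need a similar group of their own.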