Regex: matching a word that contains special characters

Time: 2011-03-21 13:41:47

Tags: php regex

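I'm trying to find a regular expression that matches a word in a string (the exact word). The problem is that the word contains special characters such as "#". The special characters can be any UTF-8 characters, for example "áéíóúñ#@", and the match has to ignore punctuation.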

Here are some examples of what I'm looking for:

Searching:#myword

 Sentence: "I like the elephants when they say #myword" <- MATCH
 Sentence: "I like the elephants when they say #mywords" <- NO MATCH
 Sentence: "I like the elephants when they say myword" <-NO MATCH
 Sentence: "I don't like #mywords. its silly" <- NO MATCH
 Sentence: "I like #myword!! It's awesome" <- MATCH
 Sentence: "I like #myword It's awesome" <- MATCH

PHP sample code:

 $regexp= "#myword";
    if (preg_match("/(\w$regexp)/", "I like #myword!! It's awesome")) {
        echo "YES YES YES";
    } else {
        echo "NO NO NO ";
    }

Thanks!

Update: if I search for "myword", the matched word has to begin with the "m" itself and not with another character such as "#".

Sentence: "I like myword!! It's awesome" <- MATCH
Sentence: "I like #myword It's awesome" <-NO MATCH

3 Answers:

Answer 0 (score: 2)

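The solution below comes from treating the characters and the boundaries separately: the word must be preceded by whitespace or the start of the string, and followed by a non-word character or the end of the string. There may also be a workable approach that uses word boundaries directly.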

Code:

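function search($strings,$search) {
        // Escape regex metacharacters in the search term; "/" is the delimiter.
        $quoted = preg_quote($search, '/');
        // Left: whitespace or start of string. Right: non-word character or end.
        // i = case-insensitive, u = treat pattern and subject as UTF-8.
        $regexp = "/(?:[[:space:]]|^)".$quoted."(?:[^\w]|$)/iu";
        foreach ($strings as $string) {
                echo "Sentence: \"$string\" <- " . 
                     (preg_match($regexp,$string) ? "MATCH" : "NO MATCH") ."\n";
        }
}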

$strings = array(
"I like the elephants when they say #myword",
"I like the elephants when they say #mywords",
"I like the elephants when they say myword",
"I don't like #mywords. its silly",
"I like #myword!! It's awesome",
"I like #mywOrd It's awesome",
);
echo "Example 1:\n";
search($strings,"#myword");

$strings = array(
"I like myword!! It's awesome",
"I like #myword It's awesome",
);
echo "Example 2:\n";
search($strings,"myword");

Output:

Example 1:
Sentence: "I like the elephants when they say #myword" <- MATCH
Sentence: "I like the elephants when they say #mywords" <- NO MATCH
Sentence: "I like the elephants when they say myword" <- NO MATCH
Sentence: "I don't like #mywords. its silly" <- NO MATCH
Sentence: "I like #myword!! It's awesome" <- MATCH
Sentence: "I like #mywOrd It's awesome" <- MATCH
Example 2:
Sentence: "I like myword!! It's awesome" <- MATCH
Sentence: "I like #myword It's awesome" <- NO MATCH
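
As a possible variation on the approach above (a sketch, not part of the original answer), the two boundary checks can be written as lookarounds so that they consume no characters, with Unicode properties covering accented letters. The function name `containsExactWord` is hypothetical, and the sketch assumes that a word counts as "exact" when it is preceded by whitespace or the start of the string and is not followed by a letter, digit, or underscore.

```php
<?php
// Hypothetical helper: true when $word occurs in $haystack as a whole word.
function containsExactWord(string $haystack, string $word): bool
{
    // preg_quote() escapes regex metacharacters in the search term.
    $quoted = preg_quote($word, '/');
    // (?<!\S)             : not preceded by a non-space character
    // (?![\p{L}\p{N}_])   : not followed by a letter, digit, or underscore
    // i = case-insensitive, u = pattern and subject are UTF-8
    $pattern = '/(?<!\S)' . $quoted . '(?![\p{L}\p{N}_])/iu';
    return preg_match($pattern, $haystack) === 1;
}

$cases = [
    ["I like #myword!! It's awesome", '#myword'],
    ["I don't like #mywords. its silly", '#myword'],
    ["I like #myword It's awesome", 'myword'],
    ['Vi al #señor ayer', '#señor'],
];
foreach ($cases as [$sentence, $word]) {
    echo $word, ' in "', $sentence, '": ',
         containsExactWord($sentence, $word) ? 'MATCH' : 'NO MATCH', "\n";
}
```

Because the lookarounds match positions rather than characters, the same pattern could also be reused with preg_match_all or preg_replace without swallowing the surrounding whitespace or punctuation.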

Answer 1 (score: 1)

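For a plain word like myword you should search with word boundaries: /\bmyword\b/. The # is itself a non-word character, so /\b#myword\b/ does not work: the \b in front of # only matches when # is directly preceded by a word character. One idea was to escape the Unicode character with \X, but that creates other problems.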

/ #myword\b/
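
A short sketch of how these patterns behave may help; the helper name `matches` and the test sentences are made up for illustration. It shows that \b before # fails after a space, that /\bmyword\b/ also matches inside "#myword", and that the pattern with a literal leading space misses a word at the very start of the string.

```php
<?php
// Hypothetical helper: true when $pattern matches somewhere in $subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$checks = [
    // \b before "#" needs a word character on its left, so a space fails.
    ['/\b#myword\b/', 'I like #myword!!'],
    // With a word character directly before "#", the boundary exists.
    ['/\b#myword\b/', 'x#myword'],
    // There is a boundary between "#" and "m", so this also matches.
    ['/\bmyword\b/', 'I like #myword'],
    // Literal space before "#": works in the middle of a sentence...
    ['/ #myword\b/', 'I like #myword!!'],
    // ...but not when the word starts the string.
    ['/ #myword\b/', '#myword comes first'],
];
foreach ($checks as [$pattern, $subject]) {
    echo $pattern, ' on "', $subject, '": ',
         matches($pattern, $subject) ? 'MATCH' : 'NO MATCH', "\n";
}
```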

Answer 2 (score: 0)

This should do the trick (replace "myword" with whatever you want to find):

^.*#myword[^\w].*$

If the match succeeds, your word was found; otherwise it was not.
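
One caveat worth checking: [^\w] has to consume a character, so this pattern cannot match when the word is the last thing in the string, which is the first example in the question. The sketch below compares it with a variant that uses a negative lookahead instead; the variant is an illustration rather than part of this answer, and neither pattern checks what comes before the #.

```php
<?php
$atEnd  = 'I like the elephants when they say #myword';
$plural = 'I like the elephants when they say #mywords';

// Pattern from the answer: needs one non-word character after the word,
// so it finds nothing when the word ends the string.
var_dump(preg_match('/^.*#myword[^\w].*$/', $atEnd));   // int(0)

// Variant: (?!\w) only asserts that no word character follows,
// which is also true at the end of the string.
var_dump(preg_match('/#myword(?!\w)/', $atEnd));        // int(1)
var_dump(preg_match('/#myword(?!\w)/', $plural));       // int(0)
```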