Question

Can anyone help me write a preg_match rule that detects whether an input string is one of a set of Unicode characters?

Here is the list of characters:

http://www.utf8-chartable.de/unicode-utf8-table.pl?start=9728&number=128&utf8=string-literal

I want to write a method that detects whether the input string is an emoticon:

function detectEmoticons($input) {
    if (preg_match("/REGEX/", $input)) {
        return TRUE;
    } else {
        return FALSE;
    }
}

If the input is a string such as "\xe2\x98\x80" or "\xe2\x98\x81", and so on (any of the characters in the list at http://www.utf8-chartable.de/unicode-utf8-table.pl?start=9728&number=128&utf8=string-literal), then it should return

TRUE

Thanks in advance,
UTTAM

Answer 1

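First, if you want the regular expression to work with Unicode, use the u modifier. Second, use a character class that covers every character in the range [\x{2600}-\x{267F}] (that is, U+2600 to U+267F). You can now write your function as: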

function detectEmoticons($input) {
    if (preg_match("/[\x{2600}-\x{267F}]/u", $input)) {
        return TRUE;
    } else {
        return FALSE;
    }
}
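A quick way to sanity-check the range is to run it against a few inputs. The following is only a sketch: the name `containsMiscSymbol` is made up for illustration, the pattern is written in single quotes so that PHP passes `\x{2600}` through to PCRE untouched, and the result is compared with `=== 1` because `preg_match` returns `false` on error (for example, when the input is not valid UTF-8 under the `u` modifier).

```php
<?php
// Hypothetical helper name; same character class as in the answer above.
function containsMiscSymbol(string $input): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/[\x{2600}-\x{267F}]/u', $input) === 1;
}

var_dump(containsMiscSymbol("\xe2\x98\x80"));       // U+2600 as UTF-8 bytes
var_dump(containsMiscSymbol("sunny \xe2\x98\x81")); // U+2601 inside other text
var_dump(containsMiscSymbol("plain ASCII text"));   // nothing from the range
```

The first two calls should report a match and the third should not. Because the pattern is not anchored, the function answers "does the string contain such a character", not "is the string exactly one such character".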

Answer 2

To match Unicode characters in a regular expression, you have to add the u modifier.

Example:

function detectEmoticons($input) {
    if (preg_match("/REGEX/u", $input)) {
        return TRUE;
    } else {
        return FALSE;
    }
}

If you have to match any one of several characters, you can pass a character range such as

/[\x{START}-\x{END}]/u

Or check for each character with the mb_strpos function.

Example:

function detectEmoticons($input) {
    // Each entry must be one complete UTF-8 character, not a single byte.
    $characters = array("\xe2\x98\x80", "\xe2\x98\x81" /* , ... */);

    foreach ($characters as $v) {
        if (mb_strpos($input, $v) !== false) {
            return true;
        }
    }

    return false;
}
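Typing out every character by hand is tedious, so the list can also be generated from the code point range. The sketch below assumes the mbstring extension and PHP 7.2 or later (for `mb_chr`); the function name `containsCharInRange` and its parameters are hypothetical, chosen only for this example.

```php
<?php
// Hypothetical helper: checks the input against every character in a code point range.
function containsCharInRange(string $input, int $first, int $last): bool
{
    for ($cp = $first; $cp <= $last; $cp++) {
        // mb_chr() turns a code point into its UTF-8 encoded character (false if invalid).
        $char = mb_chr($cp, 'UTF-8');
        if ($char !== false && mb_strpos($input, $char, 0, 'UTF-8') !== false) {
            return true;
        }
    }
    return false;
}

var_dump(containsCharInRange("rain \xe2\x98\x94", 0x2600, 0x267F)); // contains U+2614
var_dump(containsCharInRange("no symbols here", 0x2600, 0x267F));   // contains none
```

This does one search per code point, so for a contiguous range the single regular expression with a character class is likely the simpler choice; the loop is mainly useful when the characters do not form a range.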

You can find the documentation here: http://ch1.php.net/manual/en/reference.pcre.pattern.modifiers.php http://ch2.php.net/manual/en/function.mb-strpos.php

Answer 3

Try this:

// "\\\\" in the PHP string becomes \\ in the pattern, which matches one literal backslash.
preg_match("/\\\\[a-zA-Z0-9_-]{1,}\\\\[a-zA-Z0-9_-]{1,}\\\\[a-zA-Z0-9_-]{3}/", $input);

Use preg_replace to strip them out:

preg_replace("/\\\\[a-zA-Z0-9_-]{1,}\\\\[a-zA-Z0-9_-]{1,}\\\\[a-zA-Z0-9_-]{3}/", '', $input);
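These patterns target the escape sequence written out as literal text (a backslash, an "x", and two hex digits), which is different from the three bytes that the sequence stands for. If the input really does arrive as text like that, one possible approach is to decode it first and then apply a code point range check. The following is a sketch under that assumption; `decodeHexEscapes` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: turns literal text such as \xe2\x98\xba into the bytes it describes.
function decodeHexEscapes(string $input): string
{
    $decoded = preg_replace_callback(
        '/\\\\x([0-9a-fA-F]{2})/', // a literal backslash, "x", then two hex digits
        function (array $m): string {
            return chr(hexdec($m[1]));
        },
        $input
    );
    // preg_replace_callback() returns null on error; fall back to the original input.
    return $decoded ?? $input;
}

$literal = '\xe2\x98\xba'; // single quotes: twelve characters of text, not three bytes
$decoded = decodeHexEscapes($literal);

var_dump(strlen($literal), strlen($decoded));
var_dump(preg_match('/[\x{2600}-\x{267F}]/u', $decoded) === 1); // U+263A is in the range
```

If the input already contains the real UTF-8 bytes, this decoding step is unnecessary and the range check can be applied directly.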

A regular expression for Unicode characters such as \xe2\x98\xba
