Alphanumeric preg_replace ignoring Unicode characters

Date: 2019-12-23 09:54:29

Tags: php regex unicode

I am trying to use preg_replace to strip everything except letters, digits, and Unicode characters. Here is the error I ran into:

  

preg_replace(): Compilation failed: range out of order in character class at offset 22

This is my regex:

[^A-Za-z0-9 \x{0080}-\x{FFFF}]

I want to convert text as in the following example:

CAFÉ? CREATORS WERE HERE!#1 => CAFÉ CREATORS WERE HERE1

-EDIT-

I tried the following solution and got this error:

$str = 'CAFÉ? CREATORS WERE HERE!#1';


$alphaNumStr = preg_replace('/[^A-Za-z0-9 x{0080}-x{FFFF}]/u', '', $str);

echo 'TEXT: ' . $alphaNumStr;
  

TEXT: preg_replace(): Compilation failed: range out of order in character class at offset 20 on line 4

2 Answers:

Answer 0: (score: 2)

You need to add the u (Unicode) flag to your regex:

$text = 'CAFÉ? CREATORS WERE HERE!#1';
echo preg_replace('/[^A-Za-z0-9 \x{0080}-\x{FFFF}]/u', '', $text);

Output

CAFÉ CREATORS WERE HERE1

Demo on 3v4l.org
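
As a rough sketch of why both the u flag and the backslashes matter, the snippet below runs three variants of the pattern on the question's input. It assumes the file is saved as UTF-8; the variable names $ok, $noFlag and $noBackslash are illustrative and do not come from the original posts. The @ operator is used only to silence the compile warning so the return values can be compared.

```php
<?php
// Assumption: this file is saved as UTF-8, so 'É' is a valid UTF-8 sequence.
$text = 'CAFÉ? CREATORS WERE HERE!#1';

// Pattern from the answer: backslashes present and the u flag set.
$ok = preg_replace('/[^A-Za-z0-9 \x{0080}-\x{FFFF}]/u', '', $text);

// Same pattern without the u flag: \x{FFFF} cannot be compiled in
// single-byte mode, so preg_replace() returns null.
$noFlag = @preg_replace('/[^A-Za-z0-9 \x{0080}-\x{FFFF}]/', '', $text);

// Pattern from the question's edit: the backslashes are missing, so
// "x{0080}-x{FFFF}" is read as literal characters and "}-x" becomes a
// reversed range ('}' is 0x7D, 'x' is 0x78). Compilation fails again.
$noBackslash = @preg_replace('/[^A-Za-z0-9 x{0080}-x{FFFF}]/u', '', $text);

var_dump($ok, $noFlag, $noBackslash);
```

Only the first call should return the cleaned string; the other two should come back as NULL. The reversed "}-x" range sits at offset 20 of the pattern, which appears to be what the error in the question's edit is pointing at.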

Answer 1: (score: 2)

If you want to keep all letters from all languages, use:

$str = 'CAFÉ? CREATORS WERE HERE!#1';
echo preg_replace('/[^\p{L}\d\s]+/u', '', $str);

Output:

CAFÉ CREATORS WERE HERE1

\p{L} stands for any letter in any language; \d and \s keep digits and whitespace.
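
The two answers are not equivalent, which a small comparison can illustrate. The sketch below assumes a UTF-8 source file; the sample string and the variable names $byRange and $byProperty are made up for illustration. The code point range keeps everything from U+0080 to U+FFFF, including symbols such as the euro sign, while the \p{L} class keeps only letters, digits, and whitespace.

```php
<?php
// Assumption: UTF-8 source file. The sample string is hypothetical.
$sample = 'Привет, мир! 100€';

// Answer 0: keep ASCII letters, digits, spaces, and every code point in U+0080..U+FFFF.
$byRange = preg_replace('/[^A-Za-z0-9 \x{0080}-\x{FFFF}]/u', '', $sample);

// Answer 1: keep letters of any script (\p{L}), digits, and whitespace.
$byProperty = preg_replace('/[^\p{L}\d\s]+/u', '', $sample);

echo $byRange, PHP_EOL;    // Привет мир 100€
echo $byProperty, PHP_EOL; // Привет мир 100
```

Which one fits depends on whether non-letter symbols above U+007F should survive the cleanup.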

Further reading