I'm working on a function that checks an array for special characters that I don't want.
The main function:
public function scan()
{
    $matches = array();
    foreach ($this->arr_select as $key => $value) {
        // Keep only the entries whose filename contains at least one unwanted character.
        if (preg_match_all(StringWorker::getRegex(), $value['filename'], $match)) {
            $matches[$key]['id'] = $value['id'];
            $matches[$key]['filename'] = $value['filename'];
            $matches[$key]['chars'] = $match;
        }
    }
    return $matches;
}
This is the source array:
Array
(
[0] => Array
(
[id] => 2
[filename] => image_708_2_41_87026_gg_Mytrauringstore_Verlobungsringe_Saint_Maurice.jpg
)
...
[6861] => Array
(
[id] => 12322
[filename] => image_2623_йцу.JPG
)
...
[7162] => Array
(
[id] => 12699
[filename] => image_3050_Ringänderung_Service_kostenlos_Mytrauringstore.jpg
)
...
)
With this regular expression
protected static $regex = '([^a-zA-Z0-9äöüßÄÖÜ_.\-\+]+)';
I get this result:
Array
(
[6861] => Array
(
[id] => 12322
[filename] => image_2623_йцу.JPG
[chars] => Array
(
[0] => Array
(
[0] => �¹
[1] => �†
[2] => �ƒ
)
)
)
...
[7162] => Array
(
[id] => 12699
[filename] => image_3050_Ringänderung_Service_kostenlos_Mytrauringstore.jpg
[chars] => Array
(
[0] => Array
(
[0] => �ˆ
)
)
)
)
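One thing worth checking with the first pattern is that it has no `u` modifier. Without it, PCRE compares single bytes, so a multi-byte UTF-8 character can be split into fragments, which would be consistent with the `�` output above. The following is only a sketch: it assumes the filenames are UTF-8, writes the umlauts as code-point escapes so the result does not depend on how the source file is saved, and uses illustrative variable names that are not part of the original class.

```php
<?php
// Assumption: filenames are UTF-8 encoded. Variable names are illustrative only.
// Same whitelist as above, but with the umlauts written as code points and the
// u modifier added, so pattern and subject are compared character by character.
$regex = '/[^a-zA-Z0-9\x{E4}\x{F6}\x{FC}\x{DF}\x{C4}\x{D6}\x{DC}_.\-+]+/u';

// The Cyrillic letters from the example filename (U+0439, U+0446, U+0443).
$cyrillic = "image_2623_\u{0439}\u{0446}\u{0443}.JPG";

preg_match_all($regex, $cyrillic, $match);

// Each match is now a run of whole characters rather than a fragment of bytes.
foreach ($match[0] as $hit) {
    echo $hit, ' (', bin2hex($hit), ')', PHP_EOL;
}
```

With the `u` modifier the three Cyrillic letters come back as a single, intact run.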
And with this regular expression
protected static $regex = '/[\xc2-\xdf][\x80-\xbf]/';
I get this result:
Array
(
[6861] => Array
(
[id] => 12322
[filename] => image_2623_йцу.JPG
[chars] => Array
(
[0] => Array
(
[0] => Ð
[1] => ¹
[2] => Ñ
[3] => Ñ
[4] => ƒ
)
)
)
...
[7162] => Array
(
[id] => 12699
[filename] => image_3050_Ringänderung_Service_kostenlos_Mytrauringstore.jpg
[chars] => Array
(
[0] => Array
(
[0] => Ì
[1] => ˆ
)
)
)
)
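The raw bytes behind such matches can be made readable with `bin2hex`. The sketch below rests on an assumption: that the umlaut in the second filename is stored in decomposed form (a plain `a` followed by the combining diaeresis U+0308), which is what the matched bytes `Ì` `ˆ` (0xCC 0x88 when read as Windows-1252) would suggest. Both test strings are hypothetical and not taken from the real data.

```php
<?php
// Hypothetical test strings, not taken from the real data.
// Decomposed form: "a" followed by U+0308 (combining diaeresis).
$decomposed  = "Ringa\u{0308}nderung";
// Precomposed form: the single code point U+00E4.
$precomposed = "Ring\u{00E4}nderung";

// The second pattern from above: matches any two-byte UTF-8 sequence.
$regex = '/[\xc2-\xdf][\x80-\xbf]/';

$hex = [];
foreach ([$decomposed, $precomposed] as $name) {
    preg_match_all($regex, $name, $match);
    // Convert every match to hex so the bytes are readable in any terminal.
    $hex[] = implode(' ', array_map('bin2hex', $match[0]));
}

print_r($hex);
```

The two strings look identical on screen but produce different byte sequences, so a character class that lists only the precomposed umlauts would not cover the decomposed spelling.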
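So the second regex matches the characters that are actually in the string. But it also matches these characters:
äöüßÄÖÜ
These characters should not match.
How can I handle this?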
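One possible way to handle this is sketched below, under two assumptions: the filenames are UTF-8, and the unexpected hits on umlauts come from decomposed characters (a base letter plus U+0308). The pattern `/[\xc2-\xdf][\x80-\xbf]/` matches every two-byte UTF-8 sequence, which necessarily includes the precomposed umlauts, so the sketch goes back to a whitelist with the `u` modifier and also accepts the decomposed spelling. `findUnwanted` is a hypothetical helper, not part of the original class.

```php
<?php
/**
 * Hypothetical helper: returns the runs of characters that are NOT allowed.
 * Assumes UTF-8 input; returns an empty array for a clean filename.
 */
function findUnwanted(string $filename): array
{
    // Allowed units: a/o/u followed by U+0308 (decomposed umlaut), or any single
    // character from the whitelist (precomposed umlauts and eszett included).
    $allowed = '/(?:[aouAOU]\x{0308}|[a-zA-Z0-9\x{E4}\x{F6}\x{FC}\x{DF}\x{C4}\x{D6}\x{DC}_.\-+])+/u';

    // Splitting on the allowed runs leaves only the unwanted pieces.
    $parts = preg_split($allowed, $filename, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on error (for example invalid UTF-8);
    // in that case the whole name is reported as unwanted.
    return $parts === false ? [$filename] : $parts;
}

var_dump(findUnwanted("image_2623_\u{0439}\u{0446}\u{0443}.JPG"));     // one run: the Cyrillic letters
var_dump(findUnwanted("image_3050_Ringa\u{0308}nderung_Service.jpg")); // empty array
```

If the `intl` extension is available, an alternative would be to convert each filename to NFC with `Normalizer::normalize($filename, Normalizer::FORM_C)` first, so that only the precomposed umlauts need to appear in the whitelist.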