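Finding characters that do not conform to the filename convention

Date: 2017-06-23 12:50:33

Tags: php regex

I am working on a function that checks the filenames in an array for special characters that I do not want.

Main function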

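public function scan()
{
  // Collect every entry whose filename contains at least one disallowed character.
  $matches = array();
  foreach ($this->arr_select as $key => $value) {
    // preg_match_all() returns the number of matches; 0 (or false on error) skips the entry.
    if (preg_match_all(StringWorker::getRegex(), $value['filename'], $match)) {
      $matches[$key]['id'] = $value['id'];
      $matches[$key]['filename'] = $value['filename'];
      $matches[$key]['chars'] = $match;
    }
  }
  return $matches;
}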

This is the source array

Array
(
    [0] => Array
        (
            [id] => 2
            [filename] => image_708_2_41_87026_gg_Mytrauringstore_Verlobungsringe_Saint_Maurice.jpg
        )

    ...

    [6861] => Array
        (
            [id] => 12322
            [filename] => image_2623_йцу.JPG
        )

    ...

    [7162] => Array
        (
            [id] => 12699
            [filename] => image_3050_Ringänderung_Service_kostenlos_Mytrauringstore.jpg
        )

    ...
)

So with this regular expression

protected static $regex = '([^a-zA-Z0-9äöüßÄÖÜ_.\-\+]+)';

I get this result

Array
(
    [6861] => Array
        (
            [id] => 12322
            [filename] => image_2623_йцу.JPG
            [chars] => Array
                (
                    [0] => Array
                        (
                            [0] => �¹
                            [1] => �†
                            [2] => �ƒ
                        )

                )

        )

    ...


    [7162] => Array
        (
            [id] => 12699
            [filename] => image_3050_Ringänderung_Service_kostenlos_Mytrauringstore.jpg
            [chars] => Array
                (
                    [0] => Array
                        (
                            [0] => �ˆ
                        )

                )

        )

)
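
The `�` symbols in this result suggest that the matches are fragments of multi-byte UTF-8 characters rather than whole characters. Without the `u` modifier, PCRE reads both pattern and subject as byte strings, so each umlaut in the character class contributes two separate bytes instead of one character. The sketch below illustrates the difference with a hypothetical filename, `café.jpg`, that is not part of the source array; it assumes the PHP source file itself is saved as UTF-8.

```php
<?php
// Hypothetical filename, not taken from the source array: "é" is the bytes C3 A9.
$name = "caf\u{00E9}.jpg";

// Pattern from the question: no "u" modifier, so the class is a set of bytes.
// It contains C3 (the lead byte of ä, ö, ü, ß, ...), but not A9.
$bytePattern = '([^a-zA-Z0-9äöüßÄÖÜ_.\-\+]+)';

// Same class with the "u" modifier: pattern and subject are read as UTF-8.
$utf8Pattern = '([^a-zA-Z0-9äöüßÄÖÜ_.\-\+]+)u';

preg_match_all($bytePattern, $name, $byteMatch);
preg_match_all($utf8Pattern, $name, $utf8Match);

// Print the matches as hex so the terminal encoding cannot distort them.
echo bin2hex($byteMatch[0][0]), "\n"; // a9   (a lone continuation byte)
echo bin2hex($utf8Match[0][0]), "\n"; // c3a9 (the complete character)
```

In byte mode the lead byte `C3` is "allowed" because it also starts `ä`, so only the trailing byte is reported, which is not valid UTF-8 on its own and is typically displayed as `�`.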

And with this regular expression

protected static $regex = '/[\xc2-\xdf][\x80-\xbf]/';

I get this result

Array
(
    [6861] => Array
        (
            [id] => 12322
            [filename] => image_2623_йцу.JPG
            [chars] => Array
                (
                    [0] => Array
                        (
                            [0] => Ð
                            [1] => ¹
                            [2] => Ñ
                            [3] => Ñ
                            [4] => ƒ
                        )

                )

        )

    ...

    [7162] => Array
        (
            [id] => 12699
            [filename] => image_3050_Ringänderung_Service_kostenlos_Mytrauringstore.jpg
            [chars] => Array
                (
                    [0] => Array
                        (
                            [0] => Ì
                            [1] => ˆ
                        )

                )

        )

)

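So the second regular expression matches the characters that are actually present in the string. But it also matches these characters:

äöüßÄÖÜ

These characters, however, should not match.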
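
The reason the umlauts match is visible in the pattern itself: `[\xc2-\xdf][\x80-\xbf]` describes the byte layout of every two-byte UTF-8 sequence (U+0080 to U+07FF), and `ä`, `ö`, `ü`, `ß` and their uppercase forms all fall in that range. A small check, using sample characters chosen here for illustration, might look like this:

```php
<?php
// Pattern from the question: any byte C2..DF followed by any byte 80..BF,
// which is the shape of every two-byte UTF-8 sequence (U+0080 to U+07FF).
$pattern = '/[\xc2-\xdf][\x80-\xbf]/';

// Sample characters chosen for illustration.
$samples = [
    'a-umlaut'            => "\u{00E4}", // C3 A4
    'sharp s'             => "\u{00DF}", // C3 9F
    'cyrillic short i'    => "\u{0439}", // D0 B9
    'combining diaeresis' => "\u{0308}", // CC 88
];

$results = [];
foreach ($samples as $label => $char) {
    // preg_match() returns 1 when the pattern matches, 0 when it does not.
    $results[$label] = preg_match($pattern, $char);
    printf("%-20s %s -> %d\n", $label, bin2hex($char), $results[$label]);
}
```

All four samples match, so this pattern cannot separate the wanted German characters from the unwanted ones; it only separates ASCII from two-byte sequences.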

So how can this be handled?
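
One possible direction, offered as a sketch rather than a confirmed fix: the bytes shown as `Ì` and `ˆ` in the second result correspond to `CC 88` when read as Windows-1252, which is the UTF-8 encoding of U+0308 COMBINING DIAERESIS. That suggests the `ä` in `Ringänderung` may be stored in decomposed form (`a` followed by U+0308), so it never equals the precomposed `ä` in the character class. Under that assumption, normalizing each filename to NFC and adding the `u` modifier to the first pattern could help. The function name `findDisallowedChars` is hypothetical and not part of the original `StringWorker` class; `Normalizer` requires the intl extension, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper, not part of the original StringWorker class.
// Returns the runs of disallowed characters, or null if the name is not valid UTF-8.
function findDisallowedChars(string $filename): ?array
{
    // Compose "a" + U+0308 into a single "ä" when the intl extension is available.
    if (class_exists('Normalizer')) {
        $normalized = Normalizer::normalize($filename, Normalizer::FORM_C);
        if ($normalized !== false) {
            $filename = $normalized;
        }
    }

    // Character class from the question, with the "u" modifier added so that
    // pattern and subject are treated as UTF-8 characters instead of bytes.
    $pattern = '([^a-zA-Z0-9äöüßÄÖÜ_.\-\+]+)u';

    // With "u", preg_match_all() returns false when the subject is invalid UTF-8.
    $count = preg_match_all($pattern, $filename, $match);

    return $count === false ? null : $match[0];
}

// Cyrillic letters are reported as one run of whole characters.
var_dump(findDisallowedChars("image_2623_\u{0439}\u{0446}\u{0443}.JPG"));

// A precomposed "ä" is accepted and yields an empty array.
var_dump(findDisallowedChars("Ring\u{00E4}nderung.jpg"));

// A decomposed "ä" is accepted only if Normalizer is available.
var_dump(findDisallowedChars("Ringa\u{0308}nderung.jpg"));
```

Whether normalizing is appropriate depends on whether the files may be renamed or compared byte-for-byte elsewhere; the sketch only normalizes the copy used for checking.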

0 answers:

No answers yet