preg_match_all and umlauts

Time: 2013-02-16 05:50:24

Tags: php regex preg-match-all

I am using preg_match_all to filter a string.

The string I pass to preg_match_all is

$text = "Friedric'h Wöhler";

After that I use

preg_match_all('/(\"[^"]+\"|[\\p{L}\\p{N}\\*\\-\\.\\?]+)/', $text, $arr, PREG_PATTERN_ORDER);

The result I get when I print $arr is

Array
(
    [0] => Array
        (
            [0] => friedric
            [1] => h
            [2] => w
            [3] => ouml
            [4] => hler
        )

    [1] => Array
        (
            [0] => friedric
            [1] => h
            [2] => w
            [3] => ouml
            [4] => hler
        )

)

Somehow the ö character is replaced by ouml, and I am not sure how to fix this.

I expect the following result

Array
(
    [0] => Array
        (
            [0] => Friedric'h 
            [1] => Wöhler
        )

)

2 Answers:

Answer 0: (score: 1)

Per nhahtdh's comment:

$text = "Friedric'h Wöhler";
preg_match_all('/"[^"]+"|[\p{L}\p{N}*.?\\\'-]+/u', $text, $arr, PREG_PATTERN_ORDER);
echo "<pre>";
print_r($arr);
echo "</pre>";

gives

Array
(
    [0] => Array
        (
            [0] => Friedric'h
            [1] => Wöhler
        )

)
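
The question does not say where the ouml token comes from. One plausible explanation (an assumption, not confirmed by the asker) is that the string was lowercased and passed through something like htmlentities() before matching, so the regex saw w&ouml;hler and split on & and ;. The sketch below assumes that situation: it decodes the entities first, then applies the pattern from this answer with the /u modifier. The variable $encoded is made up for illustration.

```php
<?php
// Assumption: the input arrives with HTML entities, e.g. after htmlentities().
$encoded = "Friedric'h W&ouml;hler";

// Turn entities such as &ouml; back into UTF-8 characters before matching.
$text = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Same pattern as above. The /u modifier makes PCRE treat the pattern and
// the subject as UTF-8, so \p{L} matches ö as one letter.
preg_match_all('/"[^"]+"|[\p{L}\p{N}*.?\\\'-]+/u', $text, $arr, PREG_PATTERN_ORDER);

print_r($arr[0]);
```

If the original casing matters, any strtolower() call would also have to be removed or moved to after the match.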

Answer 1: (score: 0)

If you find preg_match_all() too messy, you could take a look at the T-Regx tool

$p = '"[^"]+"|[\p{L}\p{N}*.?\\\'-]+';  // automatic delimiters
$text = "Friedric'h Wöhler";

$result = pattern($p)->match($text)->all();

You can also iterate over it, or use the filter() / map() / forEach() methods

$result = pattern($p)->match($text)
    ->filter(function (Match $match) {
        return strlen($match->text()) > 0;
    })
    ->forEach(function (Match $m) {
        echo "I matched $m";
    });
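
For comparison, the same filter / map / iterate idea can be sketched with plain PHP functions and no extra library. This is only an illustration: it assumes PHP 7.4+ (arrow functions) and the mbstring extension, and the filter rule and the uppercase transform are arbitrary choices that do not come from the answer above.

```php
<?php
$text = "Friedric'h Wöhler";

// Collect all tokens; $m[0] holds the full matches.
preg_match_all('/"[^"]+"|[\p{L}\p{N}*.?\\\'-]+/u', $text, $m);

// Filter: keep tokens longer than one character (hypothetical rule).
$long = array_filter($m[0], fn (string $t): bool => mb_strlen($t, 'UTF-8') > 1);

// Map: transform each remaining token (here: uppercase, multibyte-safe).
$upper = array_map(fn (string $t): string => mb_strtoupper($t, 'UTF-8'), $long);

// Iterate over the result.
foreach ($upper as $token) {
    echo "I matched $token", PHP_EOL;
}
```

array_filter() preserves the original keys, so wrap the result in array_values() if a reindexed list is needed.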