How to split a string by character, taking special characters into account

Time: 2012-04-29 14:40:54

Tags: php string split

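I am trying to split a string into individual characters, but I am running into a problem with special characters. I am currently using the following code: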

<?php
$input = "Comment ça va?";
$array_input = str_split($input, 1);
print_r($array_input);
?>

This is the output:

Array (
[0] => C [1] => o [2] => m [3] => m [4] => e
[5] => n [6] => t [7] => [8] => � [9] => �
[10] => a [11] => [12] => v [13] => a [14] => ? )

I run into the same problem with line breaks:

Input:
“Hé!
Oui?”

Output:

Array ( [0] => H [1] => � [2] => � [3] => ! [4] => 
[5] => [6] => O [7] => u [8] => i [9] => ? )

Does anyone have a solution to this problem? Thanks a lot.

2 answers:

Answer 0: (score: 3)

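str_split has problems with Unicode strings: it splits by byte, so a multibyte character is cut into pieces.

You can use preg_split with the u modifier instead.

For example: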

$input = "Comment ça va?";
$letters1 = str_split($input);
$letters2 = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($letters1);
print_r($letters2);

This will output:

Array ( [0] => C [1] => o [2] => m [3] => m [4] => e 
        [5] => n [6] => t [7] => [8] => � [9] => � 
        [10] => a [11] => [12] => v [13] => a [14] => ? ) 

Array ( [0] => C [1] => o [2] => m [3] => m [4] => e 
        [5] => n [6] => t [7] => [8] => ç [9] => a 
        [10] => [11] => v [12] => a [13] => ? ) 
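
To make the difference between the two arrays concrete, here is a minimal sketch of what happens at the byte level. It assumes the script file is saved as UTF-8, in which case "ç" is stored as the two bytes C3 A7; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so "ç" occupies two bytes (C3 A7).
$input = "Comment ça va?";

// strlen() counts bytes: 14 characters, one of which takes two bytes.
echo strlen($input), "\n";                 // 15

// With the u modifier the pattern works on code points, not bytes.
$chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
echo count($chars), "\n";                  // 14

// The byte-wise split cuts "ç" into two halves that are invalid UTF-8 on their own,
// which is why they are displayed as replacement characters.
$bytes = str_split($input);
echo bin2hex($bytes[8] . $bytes[9]), "\n"; // c3a7
```

Entries 8 and 9 of the str_split result are the two halves of "ç", which matches the two broken entries in the first array above.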

Answer 1: (score: 2)

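This happens because PHP's str_split function is not multibyte safe, i.e. it cannot handle Unicode correctly. You can use this function, which is a multibyte-safe implementation of str_split:
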
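# PHP 7.4+ ships a native mb_str_split() (mbstring), so only define this if it is missing.
if (!function_exists('mb_str_split')) {
    function mb_str_split( $string ) {
        # Split at every position that is not right after the start (^)
        # and not right before the end ($).
        return preg_split('/(?<!^)(?!$)/u', $string );
    }
}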

(Source: a user comment in the PHP documentation)
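
On newer PHP versions the same result may be available without a regular expression. The following is a sketch only: utf8_chars is a hypothetical helper name that appears in neither answer. It uses the built-in mb_str_split() (added in PHP 7.4, requires the mbstring extension) when it exists, and otherwise falls back to the preg_split approach shown above. It assumes the input is UTF-8.

```php
<?php
// Hypothetical helper; the name utf8_chars is not from either answer.
function utf8_chars(string $string): array
{
    // mb_str_split() is built in from PHP 7.4 when mbstring is loaded.
    if (function_exists('mb_str_split')) {
        return mb_str_split($string, 1, 'UTF-8');
    }
    // Fallback: split between code points with the u modifier.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false when the input is not valid UTF-8.
    return $chars === false ? [] : $chars;
}

print_r(utf8_chars("Comment ça va?"));
```

Checking function_exists at call time keeps the helper usable on older installations as well as on PHP 7.4 and later.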