Question

I have a string like this: Suède · Slovénie

I need to explode it on the · character. I have tried various solutions, such as:

preg_split("/[?·]/",strip_tags($single->children(2)->outertext))

explode(chr(149), strip_tags($single->children(2)->outertext)); 

explode(utf8_encode('·'),strip_tags($single->children(2)->outertext));

explode('·',strip_tags($single->children(2)->outertext));

But none of these works for me! Can anyone tell me how to do this?

Answer 1

You should use mb_split():

  var_dump(mb_split('·', 'Suède · Slovénie'));

which gives

array(2) {
  [0]=>
  string(7) "Suède "
  [1]=>
  string(10) " Slovénie"
}
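
Both pieces in the output above keep the space next to the separator. Since mb_split() takes a regular expression, one option is to fold the surrounding whitespace into the pattern. This is a minimal sketch, assuming the input is UTF-8 and the mbstring extension is loaded; the variable names are illustrative only.

```php
<?php
// Assumption: the input string is UTF-8 encoded.
mb_regex_encoding('UTF-8');

$input = 'Suède · Slovénie';

// \s* on each side absorbs the whitespace around the middle dot,
// so the pieces come back already trimmed.
$parts = mb_split('\s*·\s*', $input);

print_r($parts);
```

If the whitespace should be preserved, keep the plain '·' pattern and trim the pieces afterwards instead.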

Answer 2

This seems to work for the given string, but it may not work for all strings.

preg_split("/\b (\W+) \b/", $str);

Answer 3

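Your file most likely uses UTF-8. In UTF-8, · consists of two bytes (0xC2, 0xB7), and an expression like "/[?·]/" will split on either of those bytes individually. Instead, you have to use the u modifier so the pattern is treated as UTF-8: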

$ php -r 'print_r(preg_split("/[?·]/u", "Suède·Slovénie"));'
Array
(
    [0] => Suède
    [1] => Slovénie
)

Even better is mb_split(), the multibyte-aware split function, but it is not always available.
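
To make the byte-versus-character difference concrete, here is a small sketch that needs only the core PCRE functions. It assumes the script file itself is saved as UTF-8; the variable names are made up for illustration.

```php
<?php
// U+00B7 MIDDLE DOT written out as its two UTF-8 bytes.
$dot = "\xC2\xB7";

echo bin2hex($dot), "\n"; // c2b7
echo strlen($dot), "\n";  // 2 (strlen counts bytes, not characters)

// Without /u the class [?·] holds three separate bytes (?, 0xC2, 0xB7),
// so the dot is cut twice and leaves an empty element in the middle.
$byteMode = preg_split('/[?·]/', 'Suède·Slovénie');

// With /u the pattern and subject are read as UTF-8 code points.
// \s* also swallows optional spaces around the separator.
$utf8Mode = preg_split('/\s*·\s*/u', 'Suède · Slovénie');

print_r($byteMode);
print_r($utf8Mode);
```

Note that with /u, preg_split() returns false if the subject is not valid UTF-8, so a failed split can itself be a hint that the input uses a different encoding.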

Answer 4

You appear to be using simplehtmldom, and the characters are not being encoded correctly. Convert the page to UTF-8 before passing it to str_get_html:

// mb_convert_encoding() takes the target encoding first; 'auto' asks it to detect the encoding of `$html` before converting it to `UTF-8`
$html = str_get_html(mb_convert_encoding(file_get_contents("http://somesite.com"), 'UTF-8', 'auto'));

Then you can simply use:

explode('·',strip_tags($single->children(2)->outertext));
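
The effect of the conversion can be checked without fetching a page. The sketch below assumes, purely for illustration, that the source text arrives as ISO-8859-1, where the middle dot is the single byte 0xB7; the real page may use a different encoding.

```php
<?php
// Hypothetical input: "Suède · Slovénie" encoded as ISO-8859-1.
$latin1 = "Su\xE8de \xB7 Slov\xE9nie";

// Convert only when the bytes are not already valid UTF-8.
$utf8 = mb_check_encoding($latin1, 'UTF-8')
    ? $latin1
    : mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// After conversion the middle dot is the two-byte sequence 0xC2 0xB7,
// which is what a '·' literal in a UTF-8 source file contains.
$parts = array_map('trim', explode("\xC2\xB7", $utf8));

print_r($parts);
```

Naming the source encoding explicitly, when it is known, is generally more predictable than relying on detection.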

Answer 5

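I found the solution: in the markup, · is written as the HTML entity &middot;, so that entity is what we need to pass to explode().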

explode('&middot;',$str);
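
An alternative that does not depend on how the separator happens to be written in the markup is to decode the entities first and then split on the real character. This is a sketch under assumptions: the `$raw` markup below is invented to stand in for the outertext value, and the output is assumed to be wanted as UTF-8.

```php
<?php
// Hypothetical markup, standing in for $single->children(2)->outertext.
$raw = '<td>Su&egrave;de &middot; Slov&eacute;nie</td>';

// Strip the tags, then turn entities such as &middot; into UTF-8 characters.
$text = html_entity_decode(strip_tags($raw), ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Split on the UTF-8 middle dot (0xC2 0xB7) and trim the pieces.
$parts = array_map('trim', explode("\xC2\xB7", $text));

print_r($parts);
```

This handles both a literal · and &middot; in the source, and the accented letters come out as real characters rather than entities.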
