Question

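I have set my page's default charset and the MySQL table charset to utf8. This works fine on some pages, but on others certain Chinese characters such as '全' and '公' are displayed incorrectly, even though they are output normally elsewhere.
The only difference I have noticed between the working pages and the broken ones is that the broken pages run some ereg_replace calls before output.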

                $sounds = nl2br($model->sounds);
                $sounds= preg_replace('/(\v|\s)+/', ' ', $sounds);
                $sounds= preg_replace("#(<br />|<br /> )+[< b r > \  ]*[<br />| <br /> ]+#","<br>",$sounds);
                $pattern='#[\d]+[\-]*[\d]*[\.]+#';
                if(preg_match($pattern,$sounds)&&!preg_match('#<br />|<br />|<br>#',$sounds))
                {
                    $sounds= preg_replace("#[\d]+[\-]*[\d]*[\.]+#","<br>",$sounds);
                }

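Could these functions be the cause? If not, what else could it be?

Update: I found that everything works when I comment out $sounds= preg_replace('/(\v|\s)+/', ' ', $sounds); but I want to keep this line to collapse multiple spaces in the data. Is there a way to do that?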

Answer 1

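That is very likely the cause. Use the u (UTF-8) modifier; without it the regular expression works on individual bytes and can match only part of a multibyte Unicode character.

Also, I noticed you mention ereg_* but are actually using preg_*. That is good: always prefer preg_* over the old, slow, and deprecated ereg_* functions.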
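
To see the byte-level effect described above, here is a minimal sketch. The sample string is made up for illustration; the relevant detail is that both '全' (UTF-8 bytes E5 85 A8) and '公' (E5 85 AC) contain the byte 0x85, which PCRE's \v treats as a vertical whitespace character (NEL) when the u modifier is absent.

```php
<?php
// Made-up sample input: two of the affected characters separated by two spaces.
$text = "全  公";

// Byte mode (no u modifier): \v can match the lone byte 0x85 inside each
// character, so that byte is replaced by a space and the character breaks.
$broken = preg_replace('/(\v|\s)+/', ' ', $text);

// UTF-8 mode (u modifier): the pattern is applied to whole code points,
// so only the real spaces are collapsed.
$fixed = preg_replace('/(\v|\s)+/u', ' ', $text);

// Print the raw bytes so the difference is visible regardless of terminal encoding.
echo bin2hex($text), "\n";
echo bin2hex($broken), "\n";
echo bin2hex($fixed), "\n";
```

Comparing the hex dumps should show whether the 85 byte survives in each variant.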

Answer 2

You have to add the u modifier after the pattern's closing delimiter, like this:

'/(\v|\s)+/u'

The modifier is documented here:

http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
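
Applied to the line from the question, the modifier could be wrapped in a small helper. This is only a sketch: the function name collapse_whitespace is hypothetical, and falling back to the unchanged input on failure is one possible design choice, not something the answer prescribes.

```php
<?php
/**
 * Hypothetical helper (not from the original post): collapse runs of
 * whitespace into a single space, treating the input as UTF-8.
 */
function collapse_whitespace(string $text): string
{
    // Same pattern as in the question, with the u modifier added.
    $result = preg_replace('/(?:\v|\s)+/u', ' ', $text);

    // With the u modifier, preg_replace() returns null when the subject
    // is not valid UTF-8; in that case the input is returned unchanged.
    if ($result === null) {
        return $text;
    }
    return $result;
}

echo collapse_whitespace("全 \n\n  公"), "\n";
```

Checking for null matters because, once u is set, input that is not valid UTF-8 makes the call fail rather than return a partially processed string.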

Answer 3

You should use mb_ereg_replace instead of ereg_replace.
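
A rough sketch of that approach, assuming the mbstring extension is available with its regex support enabled. Note that mb_ereg_replace takes the pattern without delimiters, and the sample string here is made up for illustration.

```php
<?php
// Assumption: mbstring (with regex support) is installed; bail out otherwise.
if (!function_exists('mb_ereg_replace')) {
    exit("mb_ereg_replace is not available in this PHP build\n");
}

// Tell the multibyte regex functions to interpret strings as UTF-8.
mb_regex_encoding('UTF-8');

// Made-up sample input standing in for $model->sounds.
$sounds = "全 \n\n  公";

// The pattern has no delimiters; \s+ matches one or more whitespace characters.
$sounds = mb_ereg_replace('\s+', ' ', $sounds);

echo $sounds, "\n";
```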
