Replace excess whitespace and line breaks with PHP?

Asked: 2011-06-18 06:57:56

Tags: php regex whitespace

$string = "My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs";

echo preg_replace("/\s\s+/", " ", $string);

I read the PHP documentation and followed a preg_replace tutorial, but this code produces

  

My text has so much whitespace Plenty of spaces and tabs

How can I turn it into:

  

My text has so much whitespace
Plenty of spaces and tabs

10 answers:

Answer 0 (score: 45)

First, I'd like to point out that a newline can be \r, \n, or \r\n, depending on the operating system.

My solution:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/[\r\n]+/', "\n", $string));

If necessary, it can be split into two lines:

$string = preg_replace('/[\r\n]+/', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);
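
To illustrate the point about line endings, here is a small sketch that runs the two-step version on a string containing all three newline styles. The helper name `collapseWhitespace` is made up for this example and is not part of the answer.

```php
<?php
// Hypothetical helper wrapping the two-step approach shown above.
function collapseWhitespace(string $text): string
{
    // Any run of \r and/or \n becomes a single \n.
    $text = preg_replace('/[\r\n]+/', "\n", $text);
    // Any run of spaces and/or tabs becomes a single space.
    return preg_replace('/[ \t]+/', ' ', $text);
}

// Unix (\n), old Mac (\r) and Windows (\r\n) line endings in one string.
$sample = "one  two\n\nthree\r\rfour\t\tfive\r\n\r\nsix";
echo json_encode(collapseWhitespace($sample)), "\n";
// prints "one two\nthree\nfour five\nsix"
```

json_encode is used only to make the remaining \n characters visible in the output.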

Update

A better solution would be:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/\s*$^\s*/m', "\n", $string));

Or:

$string = preg_replace('/\s*$^\s*/m', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

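I have changed the regular expression so that it does a better job of collapsing multiple line breaks into one. It uses the "m" modifier (which makes ^ and $ match at the start and end of each line) and removes any \s characters (space, tab, newline, carriage return) that sit at the end of one line and the beginning of the next. This solves the problem of empty lines that contain nothing but spaces. With the previous example, a line filled with spaces would have left an extra line break behind.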
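
The difference between the two versions shows up when the gap between two lines contains spaces as well as line breaks. The sketch below compares them on such an input; the function names `collapseV1` and `collapseV2` and the sample string are invented here for illustration.

```php
<?php
// First version: collapse newline runs, then space/tab runs.
function collapseV1(string $text): string
{
    $text = preg_replace('/[\r\n]+/', "\n", $text);
    return preg_replace('/[ \t]+/', ' ', $text);
}

// Updated version: first swallow all whitespace around an empty-line position.
function collapseV2(string $text): string
{
    $text = preg_replace('/\s*$^\s*/m', "\n", $text);
    return preg_replace('/[ \t]+/', ' ', $text);
}

// Trailing spaces, an empty line, a spaces-only line, another empty line.
$sample = "first  \n\n   \n\nsecond";

echo json_encode(collapseV1($sample)), "\n"; // "first \n \nsecond"
echo json_encode(collapseV2($sample)), "\n"; // "first\nsecond"
```

One caveat worth checking against your own data: the updated pattern needs a position that is both the end of one line and the start of the next, in other words at least one truly empty line in the whitespace run. For an input such as "first\n   \nsecond", where the only line in between holds spaces, no such position appears to exist, and the spaces-only line survives as a single space.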

Answer 1 (score: 9)

Edited to give the correct answer. As of PHP 5.2.4 or so, the following code will do:

echo preg_replace('/\v(?:[\v\h]+)/', '', $string);
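
For reference, `\h` matches horizontal whitespace (such as spaces and tabs) and `\v` matches vertical whitespace (such as \n and \r). The sketch below is a variation built on those two escapes, not the exact pattern above: it keeps one line break per run and one space per run. The function name `squeeze` is hypothetical.

```php
<?php
// Hypothetical helper: a variation using the \h and \v escapes.
function squeeze(string $text): string
{
    // Horizontal whitespace touching a line break is dropped together with
    // any extra line breaks, leaving a single \n.
    $text = preg_replace('/\h*\v[\h\v]*/', "\n", $text);
    // Remaining runs of horizontal whitespace become one space.
    return preg_replace('/\h+/', ' ', $text);
}

$sample = "My    text   \n\n \t \n\nPlenty of\t\ttabs";
echo json_encode(squeeze($sample)), "\n";
// prints "My text\nPlenty of tabs"
```

Note that this variation also strips indentation at the start of the following line. The sample here is plain ASCII; for UTF-8 input it may be worth testing with the `u` modifier, which is an assumption about the data and not something the answer states.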

Answer 2 (score: 4)

Replace Multiple Newline, Tab, Space

$text = preg_replace("/[\r\n]+/", "\n", $text);
$text = preg_replace("/\s+/", ' ', $text);

Tested :)

Answer 3 (score: 4)

//Newline and tab space to single space

$from_mysql = str_replace(array("\r\n", "\r", "\n", "\t"), ' ', $from_mysql);


// Multiple spaces to single space ( using regular expression)

// preg_replace is used here because ereg_replace was removed in PHP 7
$from_mysql = preg_replace('/ {2,}/', ' ', $from_mysql);

// Replaces 2 or more spaces with a single space, {2,} indicates that you are looking for 2 or more than 2 spaces in a string.

Answer 4 (score: 2)

An alternative approach:

echo preg_replace_callback("/\s+/", function ($match) {
    $result = array();
    $prev = null;
    foreach (str_split($match[0], 1) as $char) {
        if ($prev === null || $char != $prev) {
            $result[] = $char;
        }

        $prev = $char;
    }

    return implode('', $result);
}, $string);

Output

My text has so much whitespace
Plenty of spaces and tabs

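Edit: I added this because it is a different approach. It may not be what was asked for, but at least it does not merge groups of different whitespace (for example, space, tab, tab, space, nl, nl, space, space becomes space, tab, space, nl, space).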
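
The sequence from the edit note can be checked directly. This sketch wraps the callback above in a function (the name `dedupeWhitespace` is made up here) and encodes the result so the invisible characters are readable.

```php
<?php
// Hypothetical wrapper around the callback approach shown above.
function dedupeWhitespace(string $text): string
{
    return preg_replace_callback('/\s+/', function (array $match): string {
        $result = '';
        $prev = null;
        // Walk the whitespace run and keep a character only when it differs
        // from the one immediately before it.
        foreach (str_split($match[0]) as $char) {
            if ($char !== $prev) {
                $result .= $char;
            }
            $prev = $char;
        }
        return $result;
    }, $text);
}

// space, tab, tab, space, nl, nl, space, space
$sample = "a \t\t \n\n  b";
echo json_encode(dedupeWhitespace($sample)), "\n";
// prints "a \t \n b"  (space, tab, space, nl, space)
```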

Answer 5 (score: 2)

This will completely minify the entire string (for example, a large blog post) while preserving all HTML tags.

$email_body = str_replace(PHP_EOL, ' ', $email_body);
    //PHP_EOL = PHP_End_Of_Line - would remove new lines too
$email_body = preg_replace('/[\r\n]+/', "\n", $email_body);
$email_body = preg_replace('/[ \t]+/', ' ', $email_body);

Answer 6 (score: 1)

Try:

$string = "My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs";
// Remove duplicate newlines ("+" so that the pattern never matches an empty string)
$string = preg_replace("/[\n]+/", "\n", $string);
// Preserve newlines while replacing runs of other whitespace with a single space
echo preg_replace("/[ \t]+/", " ", $string);
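
A note on the quantifiers in patterns like these: `+` requires at least one character, whereas `*` also matches the empty string, so the replacement fires at every position in the input. A minimal sketch of the difference (the sample strings are made up):

```php
<?php
$sample = "ab";

// "*" matches the empty string before "a", before "b" and at the end,
// so a newline is inserted at each of those positions.
$withStar = preg_replace('/[\n]*/', "\n", $sample);
echo json_encode($withStar), "\n"; // "\na\nb\n"

// "+" needs at least one newline, so text without newlines is untouched.
$withPlus = preg_replace('/[\n]+/', "\n", $sample);
echo json_encode($withPlus), "\n"; // "ab"

// With "+", a run of newlines collapses to one.
$collapsed = preg_replace('/[\n]+/', "\n", "a\n\n\nb");
echo json_encode($collapsed), "\n"; // "a\nb"
```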

Answer 7 (score: 1)

Why are you doing this?
HTML displays only one space even if you use multiple spaces...

For example:

<i>test               content 1       2 3 4            5</i>

The output will be:
test content 1 2 3 4 5

If you need more than a single space in HTML, you have to use &nbsp;

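Answer 8 (score: 1)

I had the same problem when passing echoed data from PHP to JavaScript (formatted as JSON). The string was littered with \r\n and \t characters that were neither needed nor displayed on the page.

The solution I ended up using is a different way of echoing. It saves a lot of server resources compared to preg_replace (which is what other people here suggest).


Here is a before-and-after comparison:

Before: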

echo '
<div>

    Example
    Example

</div>
';

Output:

\r\n\r\n\tExample\r\n\tExample\r\n\r\n


After:

echo 
'<div>',

    'Example',
    'Example',

'</div>';

Output:

ExampleExample


(Yes, you can join the pieces of an echo not only with dots but also with commas, since echo accepts several arguments.)
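
The comma form can be verified with output buffering. In this sketch `ob_start()` and `ob_get_clean()` capture what echo writes, so the two variants can be compared as strings; the variable names are arbitrary.

```php
<?php
// Variant 1: one multi-line string literal, newlines and indentation included.
ob_start();
echo '
<div>
    Example
</div>
';
$multiline = ob_get_clean();

// Variant 2: separate arguments to echo, so no whitespace between the pieces.
ob_start();
echo
'<div>',
    'Example',
'</div>';
$pieces = ob_get_clean();

var_dump($pieces);                              // string(18) "<div>Example</div>"
var_dump(strlen($multiline) > strlen($pieces)); // bool(true)
```

The whitespace between the arguments in the second variant is part of the PHP source, not of the strings, which is why none of it reaches the output.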

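Answer 9 (score: 0)

Not sure whether this will be useful, and I'm not entirely sure it works as intended, but it seems to work for me.

A function that cleans up multiple spaces and anything else you do or don't want, producing either a single-line or a multi-line string (depending on the arguments/options passed). It can also remove or keep characters from other languages, and it converts newlines and tabs to spaces.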

/** ¯\_(ツ)_/¯ Hope it's useful to someone. **/
// If $multiLine is null this removes spaces too. <options>'[:emoji:]' with $l = true allows only known emoji.
// <options>'[:print:]' with $l = true allows all utf8 printable chars (including emoji).
// **** TODO: If a unicode emoji or language char is used in $options while $l = false; we get an odd � symbol replacement for any non-matching char. $options char seems to get through, regardless of $l = false ? (bug (?)interesting)
function alphaNumericMagic($value, $options = '', $l = false, $multiLine = false, $tabSpaces = "    ") {
    $utf8Emojis = '';
    $patterns = [];
    $replacements = [];
    if ($l && preg_match("~(\[\:emoji\:\])~", $options)) {
        $utf8Emojis = [
            '\x{1F600}-\x{1F64F}', /* Emoticons */
            '\x{1F9D0}-\x{1F9E6}',
            '\x{1F300}-\x{1F5FF}', /* Misc Characters */ // \x{1F9D0}-\x{1F9E6}
            '\x{1F680}-\x{1F6FF}', /* Transport and Map */
            '\x{1F1E0}-\x{1F1FF}' /* Flags (iOS) */
        ];
        $utf8Emojis = implode('', $utf8Emojis);
    }
    $options = str_replace("[:emoji:]", $utf8Emojis, $options);
    if (!preg_match("~(\[\:graph\:\]|\[\:print\:\]|\[\:punct\:\]|\\\-)~", $options)) {
        $value = str_replace("-", ' ', $value);
    }
    if ($l) {
        $l = 'u';
        $options = $options . '\p{L}\p{N}\p{Pd}';
    } else { $l = ''; }
    if (preg_match("~(\[\:print\:\])~", $options)) {
        $patterns[] = "/[ ]+/m";
        $replacements[] = " ";
    }
    if ($multiLine) {
        $patterns[] = "/(?<!^)(?:[^\r\na-z0-9][\t]+)/m";
        $patterns[] = "/[ ]+(?![a-z0-9$options])|[^a-z0-9$options\s]/im$l";
        $patterns[] = "/\t/m";
        $patterns[] = "/(?<!^)$tabSpaces/m";
        $replacements[] = " ";
        $replacements[] = "";
        $replacements[] = $tabSpaces;
        $replacements[] = " ";
    } else if ($multiLine === null) {
        $patterns[] = "/[\r\n\t]+/m";
        $patterns[] = "/[^a-z0-9$options]/im$l";
        $replacements = "";
    } else {
        $patterns[] = "/[\r\n\t]+/m";
        $patterns[] = "/[ ]+(?![a-z0-9$options\t])|[^a-z0-9$options ]/im$l";
        $replacements[] = " ";
        $replacements[] = "";
    }
    echo "\n";
    print_r($patterns);
    echo "\n";
    echo $l;
    echo "\n";
    return preg_replace($patterns, $replacements, $value);
}

Usage example:

echo header('Content-Type: text/html; charset=utf-8', true);
$string = "fjl!sj\nfl _  sfjs-lkjf\r\n\tskj 婦女與環境健康 fsl \tklkj\thl jhj ⚧ lkj ⸀ skjfl gwo lsjowgtfls s";
echo "<textarea style='width:100%; height:100%;'>";
echo alphaNumericMagic($string, '⚧', true, null);
echo "\n\nAND\n\n";
echo alphaNumericMagic($string, '[:print:]', true, true);
echo "</textarea>";

Result:

fjlsjflsfjslkjfskj婦女與環境健康fslklkjhljhj⚧lkjskjflgwolsjowgtflss

AND

fjl!sj
fl _ sfjs-lkjf
    skj 婦女與環境健康 fsl klkj hl jhj ⚧ lkj ⸀ skjfl gwo lsjowgtfls s