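Trimming lines and collapsing whitespace in a multiline string with a regular expression

Date: 2017-02-04 21:02:49

Tags: php regex pcre multiline

Using PHP, I want to create a function that trims all unnecessary whitespace from a multiline string.

The regular expression that does not work is the last one, which removes trailing spaces: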

// Always trim at the end. Warning: this seems to be the costlier
// operation, perhaps because looking ahead is harder?
$patterns[] = ['/ +$/m', ''];

Given the following string from a textarea:

 first  line... abc   //<-- blank space here
 second  is  here... def   //<-- blank space here
 //<-- blank space here
 fourth  line... hi  there   //<-- blank space here

 sith  is  here....   //<-- blank space here

Every line has spaces at the beginning and at the end, plus more than one space between words.

After running the function:

$functions->trimWhitespace($description, ['blankLines' => false]);

This is what I get:

first line... abc //<-- blank space here
second is here... def //<-- blank space here
//<-- no blank space here
fourth line... hi there //<-- blank space here

sith is here....//<-- no blank space here

Why is trailing whitespace removed only from the last line?

5 answers:

Answer 0 (score: 2)

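You can redefine where $ matches with the (*ANYCRLF) verb. Text submitted from a textarea normally uses CRLF line endings, and by default PCRE usually treats only LF as a newline, so the \r sits between the trailing spaces and the position where $ can match. Only the last line, which has no line break after it, gets trimmed.

See the following PHP demo: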

$s = " ddd    \r\n  bbb     ";
$n = preg_replace('~(*ANYCRLF)\h+$~m', '', $s); // if the string can contain Unicode chars,
echo $n;                                        // also add "u" modifier ('~(*ANYCRLF)\h+$~um')

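Details:

  • (*ANYCRLF) - sets the newline convention to any of (*CR), (*LF), or (*CRLF)
  • \h+ - one or more horizontal whitespace characters
  • $ - end of line (now also before a CR or an LF)
  • m - multiline mode on ($ matches at the end of every line); ~ is only the pattern delimiter.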
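To make the difference concrete, here is a minimal sketch that runs the question's pattern with and without the verb on a CRLF-terminated string. The sample string and variable names are made up for illustration, and the first result assumes the PCRE build uses LF as its default newline convention, which is common but not guaranteed.

```php
<?php
// Hypothetical sample: two lines with trailing spaces, separated by CRLF.
$input = "abc   \r\ndef   ";

// Question's pattern: with an LF-only newline convention, $ cannot match
// before "\r", so only the last line is trimmed.
$withoutVerb = preg_replace('/ +$/m', '', $input);

// Same pattern with (*ANYCRLF): $ may now match before CR, LF, or CRLF.
$withVerb = preg_replace('/(*ANYCRLF) +$/m', '', $input);

var_dump($withoutVerb === "abc   \r\ndef"); // true when the newline default is LF
var_dump($withVerb === "abc\r\ndef");
```

The verb has to be the very first thing in the pattern, right after the opening delimiter.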

If you want $ to match at any Unicode line break, replace (*ANYCRLF) with (*ANY).

See Newline conventions in the PCRE reference:

(*CR)        carriage return
(*LF)        linefeed
(*CRLF)      carriage return, followed by linefeed
(*ANYCRLF)   any of the three above
(*ANY)       all Unicode newline sequences

Now, if you need to

  • trim lines at the start and at the end
  • collapse whitespace inside lines to a single space

use

$s = " Ł    ę  d    \r\n  Я      ёb     ";
$n = preg_replace('~(*ANYCRLF)^\h+|\h+$|(\h){2,}~um', '$1', $s);
echo $n;

See the PHP demo
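As a rough sketch, the combined pattern could be wrapped in a small helper. The function name, signature, and sample text below are hypothetical and are not the asker's trimWhitespace method.

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
// Returns null if preg_replace fails, e.g. on input that is not valid UTF-8.
function trimLinesAndCollapse(string $text): ?string
{
    // ^\h+      leading whitespace on each line
    // \h+$      trailing whitespace on each line
    // (\h){2,}  runs of 2+ whitespace characters inside a line, replaced
    //           by $1 (the last whitespace character of the run)
    return preg_replace('~(*ANYCRLF)^\h+|\h+$|(\h){2,}~um', '$1', $text);
}

// Sample text with CRLF line endings, as a textarea would typically submit it.
$description = " first  line... abc   \r\n second  is  here... def   \r\n   \r\n fourth  line... hi  there   ";

echo trimLinesAndCollapse($description);
```

For the first two alternatives the capture group does not participate, so $1 is empty and the match is simply removed. Line breaks are left untouched, so a whitespace-only line becomes an empty line.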

Answer 1 (score: 1)

Use a two-step approach:

<?php

$text = " first  line... abc   
 second  is  here... def   
  <-- blank space here
 fourth  line... hi  there   

 sith  is  here....   ";

// get rid of spaces at the beginning and end of line
$regex = '~^\ +|\ +$~m';
$text = preg_replace($regex, '', $text);

// collapse two or more consecutive spaces into a single space
$regex = '~\ {2,}~';
$text = preg_replace($regex, ' ', $text);
echo $text;

?>

See a demo on ideone.com
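The two patterns above match only literal spaces and do not change the newline convention, so CRLF input from a textarea would likely hit the same problem as in the question. One possible workaround, which is an assumption and not part of this answer, is to normalize line endings to LF before the two steps:

```php
<?php
// Hypothetical sample with CRLF line endings.
$text = " first  line... abc   \r\n second  is  here... def   ";

// Assumption: converting all line endings to LF is acceptable for the caller.
$text = str_replace(["\r\n", "\r"], "\n", $text);

// Step 1: strip spaces at the beginning and end of every line.
$text = preg_replace('~^ +| +$~m', '', $text);

// Step 2: collapse runs of two or more spaces into a single space.
$text = preg_replace('~ {2,}~', ' ', $text);

echo $text;
```

Unlike the (*ANYCRLF) approach, this changes the line endings of the result, which may or may not matter for how the text is stored.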

Answer 2 (score: 1)

You need /gm instead of /m.

The code should become (this code will not work; see the update below for the version that does):

$patterns[] = ['/ +$/mg', ''];

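Working example here: https://regex101.com/r/z3pDre/1

Update

The g flag does not work like this in PHP. There is no g modifier: preg_replace already replaces every match, and for matching you would replace preg_match with preg_match_all.

Use the regex without g, like this: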

$patterns[] = ['/ +$/m', ''];
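A small sketch of that behaviour, using a made-up LF-only sample string: preg_replace replaces every match unless the optional $limit argument caps it, and a pattern with a g modifier fails instead of matching.

```php
<?php
// Hypothetical sample: two lines with trailing spaces, LF line ending.
$input = "a  \nb  ";

// No g modifier needed: every match is replaced; $count receives the total.
$all = preg_replace('/ +$/m', '', $input, -1, $count);

// The optional $limit argument caps the number of replacements.
$firstOnly = preg_replace('/ +$/m', '', $input, 1);

// An unknown modifier makes the call fail and return null;
// @ only hides the warning in this demonstration.
$invalid = @preg_replace('/ +$/mg', '', $input);

var_dump($all, $count, $firstOnly, $invalid);
```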

Answer 3 (score: 0)

preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

So you want preg_replace('/[\s]+$/m', '', $string)
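One thing to watch with this pattern: \s also matches \r and \n, so it can remove more than trailing spaces. The sketch below uses a made-up sample with CRLF line endings and one blank line; the stated result assumes the usual LF default newline convention.

```php
<?php
// Hypothetical sample: "a", a blank line, then "b", with CRLF line endings.
$input = "a  \r\n\r\nb  ";

// [\s]+ is greedy and backtracks until $ can match just before an LF,
// so the spaces, the CRs, and the blank line are all consumed.
$result = preg_replace('/[\s]+$/m', '', $input);

var_dump($result === "a\nb");
```

If blank lines and the original line endings must survive, a class limited to horizontal whitespace, such as \h, may be the safer choice.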

Answer 4 (score: 0)

 preg_replace('/*(.*) +?\n*$/', $content)

Live Demo