Question

My HTML looks like this:

<a href="The Whole World">

and I would like it to look like this:

<a href="TheWholeWorld">

using Perl. How can I do this? Thanks!

Answer 1

$html = '<a href="The Whole World">';
$html =~ s/(?<=href=")([^"]+)/ $1 =~ s!\s+!!gr /e;
print $html;

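This works by replacing the text that follows href=" (up to the next ") with a modified copy of itself: a second, nested substitution removes every run of whitespace from it.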

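This uses the /r modifier of Perl's substitution operator, which is only available in Perl 5.14 and later. If your Perl does not support it, use the following instead: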

$html =~ s/(?<=href=")([^"]+)/ my $text = $1; $text =~ s!\s+!!g; $text /e;
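As a quick check of the approach above, here is a minimal sketch that wraps the pre-5.14 form in a helper and adds the /g flag so that every link in a string is handled. The helper name strip_href_spaces and the sample input are made up for illustration and are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: remove whitespace inside every href="..." value.
# $1 is read-only, so the /e block copies it before the inner substitution;
# the value of the block's last expression becomes the replacement text.
sub strip_href_spaces {
    my ($html) = @_;
    $html =~ s{(?<=href=")([^"]+)}{
        my $value = $1;
        $value =~ s/\s+//g;
        $value;
    }ge;
    return $html;
}

my $in  = '<a href="The Whole World">one</a> <a href="a b">two</a>';
my $out = strip_href_spaces($in);
print "$out\n";   # <a href="TheWholeWorld">one</a> <a href="ab">two</a>
```

Whitespace outside the quoted values, such as the space between the two links, is left alone because the lookbehind ties each match to an href=" prefix.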

Answer 2

A short code snippet:

$a='<a href="the whole world">';
($c=$a)=~s/("\S+|\S+|")\s*/$1/g;
print $c;

How the regex works:

s/("\S+|\S+|")\s*/$1/g;
  ^ ^  ^      ^   ^  ^
  | |  |      |   |  +-- global flag, apply repeatedly
  | |  |      |   +-- substitute in the first capture group
  | |  |      +-- whitespace, but outside of the capture group
  | |  +-- | alternation operator
  | +-- \S+ matches one or more non-whitespace characters
  +-- start of the capturing group

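So it matches each run of non-whitespace characters, including one that begins at the opening " or ends at the closing ", and puts it into the capture group.

The whitespace that follows each word stays outside the capture group.

This repeats along the whole string: each capture group is copied to the result, but the whitespace is not. Because nothing ties the pattern to the attribute value, whitespace elsewhere in the string, such as the space after <a, is removed as well.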
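Running the pattern on the sample input makes its scope visible: applied to the whole string, it also drops the space between <a and href. The sketch below shows that, followed by an assumed variant (not from the original answer) that applies the same pattern only to the quoted value. The variable names are illustrative.

```perl
use strict;
use warnings;

my $src = '<a href="the whole world">';

# The pattern from the answer, applied to a copy of the whole string.
(my $all = $src) =~ s/("\S+|\S+|")\s*/$1/g;
print "$all\n";      # <ahref="thewholeworld">

# Assumed variant: match the quoted value first, then run the same
# pattern on a copy of just that value inside an /e replacement.
(my $scoped = $src) =~ s{("[^"]*")}{ (my $v = $1) =~ s/("\S+|\S+|")\s*/$1/g; $v }ge;
print "$scoped\n";   # <a href="thewholeworld">
```

The second form still treats every quoted string in the input the same way, so it is only suitable when href is the only attribute whose whitespace should go.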

For XML fragments it is better to use a parser, because that is easier to maintain in the long run.
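A rough sketch of the parser route, assuming the CPAN module HTML::Parser is installed (it is not part of core Perl). Start tags are rebuilt with the whitespace stripped from href, and everything else is copied through unchanged. The sample input is made up, and the sketch ignores details such as re-encoding entities in attribute values.

```perl
use strict;
use warnings;
use HTML::Parser;   # assumption: installed from CPAN

my $html = '<p>See <a href="The Whole World">this link</a>.</p>';
my $out  = '';

my $parser = HTML::Parser->new(
    api_version => 3,
    # Start tags: for <a>, strip whitespace from href and rebuild the tag.
    start_h => [
        sub {
            my ($tag, $attr, $attrseq, $text) = @_;
            if ($tag eq 'a' && defined $attr->{href}) {
                $attr->{href} =~ s/\s+//g;
                $text = "<$tag"
                      . join('', map { qq{ $_="$attr->{$_}"} } @$attrseq)
                      . '>';
            }
            $out .= $text;
        },
        'tagname, attr, attrseq, text',
    ],
    # Text, end tags, comments and so on are passed through as-is.
    default_h => [ sub { $out .= shift }, 'text' ],
);

$parser->parse($html);
$parser->eof;
print "$out\n";   # <p>See <a href="TheWholeWorld">this link</a>.</p>
```

Compared with the regex versions, this only touches href on a elements and does not depend on how the tag happens to be quoted or spaced.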

Remove all whitespace in part of a string using a Perl regex
