Identifying the differences between two strings in Perl

Asked: 2014-09-10 22:23:15

Tags: regex string perl difference

I have two strings (sentences), and when they do not match I want to find out exactly where they differ.

My sample code is below:
my $diffpara1 = "This is paragraph 1";

my $diffpara2 = "This is paragraph 2 different from first paragraph";

my $samepara1 = "This is paragraph is same";

my $samepara2 = "This is paragraph is same";

print (($diffpara1 eq $diffpara2) ? '<span style="background-color: green">Matching</span>' : '<span style="background-color: red">Not Matching</span>');

print "<br/>".(($samepara1 eq $samepara2) ? '<span style="background-color: green">Matching</span>' : '<span style="background-color: red">Not Matching</span>');

The output of the code above is:

Sample Output

This only tells me whether the two strings (sentences) match or not. What I want is output that shows where the two strings (sentences) differ.

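The sample output I want (differences in bold, marked here with **):

This is paragraph **1**

This is paragraph **2 different from first paragraph**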

I am not sure whether a regex can produce the desired output.

Thanks in advance for your help.

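1 answer:

Answer 0 (score: 3)

Try Text::WordDiff. It can output the differences as HTML, with deleted and inserted parts wrapped in <del> and <ins> tags respectively. A simple example: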

use strict;
use warnings;
use feature ":5.10";
use Text::WordDiff;

my $diffpara1 = "This is paragraph 1";
my $diffpara2 = "This is paragraph 2 different from first paragraph";

# output the difference between the lines as HTML, on two lines:
my $diff = word_diff \$diffpara1, \$diffpara2, { STYLE => 'HTMLTwoLines' };

say $diff;

Output:

<div class="file"><span class="hunk">This is paragraph </span><span class="hunk"><del>1</del></span></div>
<div class="file"><span class="hunk">This is paragraph </span><span class="hunk"><ins>2 different from first paragraph</ins></span></div>

For identical lines:

my $samepara1 = "This is paragraph is same";
my $samepara2 = "This is paragraph is same";
my $diff2 = word_diff \$samepara1, \$samepara2, { STYLE => 'HTMLTwoLines' };
say $diff2;

Output:

<div class="file"><span class="hunk">This is paragraph is same</span></div>
<div class="file"><span class="hunk">This is paragraph is same</span></div>

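There are many output options (plain text, HTML, writing to a file, saving to a variable, and so on), and you can easily style the HTML version with CSS to show inserted and deleted text in different colors, in bold, or however you like.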
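To make the styling idea concrete, and in case installing a CPAN module is not an option, here is a minimal core-Perl sketch of a word-level diff that emits the same kind of <del>/<ins> markup plus a small stylesheet. It is a plain longest-common-subsequence diff written for illustration, not the implementation used by Text::WordDiff. The names escape_html, diff_words and diff_html and the colors in the CSS are my own assumptions; the CSS rules target only the <del> and <ins> tags, so they should apply equally to the module's HTML output.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: word-level diff via a longest-common-subsequence table.
# Returns [op, word] pairs; op is '=' (in both), '-' (old only), '+' (new only).
sub diff_words {
    my ($old, $new) = @_;
    my @a = split ' ', $old;
    my @b = split ' ', $new;

    # $lcs[$i][$j] = LCS length of @a[$i..$#a] and @b[$j..$#b]
    my @lcs;
    push @lcs, [ (0) x (@b + 1) ] for 0 .. @a;
    for my $i (reverse 0 .. $#a) {
        for my $j (reverse 0 .. $#b) {
            if ($a[$i] eq $b[$j]) {
                $lcs[$i][$j] = $lcs[$i + 1][$j + 1] + 1;
            }
            else {
                my ($down, $right) = ($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
                $lcs[$i][$j] = $down >= $right ? $down : $right;
            }
        }
    }

    # Walk the table from the start, emitting one op per word.
    my @ops;
    my ($i, $j) = (0, 0);
    while ($i < @a && $j < @b) {
        if ($a[$i] eq $b[$j]) {
            push @ops, [ '=', $a[$i] ];
            $i++;
            $j++;
        }
        elsif ($lcs[$i + 1][$j] >= $lcs[$i][$j + 1]) {
            push @ops, [ '-', $a[$i++] ];
        }
        else {
            push @ops, [ '+', $b[$j++] ];
        }
    }
    push @ops, [ '-', $a[$i++] ] while $i < @a;
    push @ops, [ '+', $b[$j++] ] while $j < @b;
    return @ops;
}

# Hypothetical helper: render the diff as HTML, merging runs of the same op.
sub diff_html {
    my ($old, $new) = @_;
    my %tag = ('-' => 'del', '+' => 'ins');
    my @runs;    # each run is [op, [words]]
    for my $pair (diff_words($old, $new)) {
        my ($op, $word) = @$pair;
        if (@runs && $runs[-1][0] eq $op) {
            push @{ $runs[-1][1] }, $word;
        }
        else {
            push @runs, [ $op, [$word] ];
        }
    }
    return join ' ', map {
        my ($op, $words) = @$_;
        my $text = escape_html(join ' ', @$words);
        $op eq '=' ? $text : "<$tag{$op}>$text</$tag{$op}>";
    } @runs;
}

# Example stylesheet (colors are arbitrary): deletions struck through on red,
# insertions bold on green.
my $css = <<'CSS';
<style>
  del { background-color: #fdd; text-decoration: line-through; }
  ins { background-color: #dfd; font-weight: bold; text-decoration: none; }
</style>
CSS

print $css;
print '<p>',
    diff_html("This is paragraph 1",
              "This is paragraph 2 different from first paragraph"),
    "</p>\n";
```

For the two strings from the question, diff_html returns `This is paragraph <del>1</del> <ins>2 different from first paragraph</ins>`, and for two identical strings it returns the text unchanged. The LCS table costs time and memory proportional to the product of the two word counts, which is fine for sentences; for large documents the module is the better choice.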