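I want to remove every line that is duplicated in a text file, not just the extra copies. In other words, both the repeated line and the line it repeats should go, leaving me with a list of only the lines that were never duplicated. Maybe a regular expression could do this in Notepad++? If so, which one? Are there other ways?
Answer 0 (score: 2)
If you are on a Unix-like system, you can use the uniq command with the -u option, which prints only the lines that are not repeated.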
ezra@ubuntu:~$ cat test.file
ezra
ezra
john
user
ezra@ubuntu:~$ uniq -u test.file
john
user
Note that this works because the identical lines are adjacent. If they are not, you have to sort the file first.
ezra@ubuntu:~$ cat test.file
ezra
john
ezra
user
ezra@ubuntu:~$ uniq -u test.file
ezra
john
ezra
user
ezra@ubuntu:~$ sort test.file | uniq -u
john
user
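Sorting changes the order of the surviving lines. If the original order matters, one alternative (a minimal sketch, not part of the answer above) is to read the file twice with awk: count every line on the first pass and print only the lines with a count of 1 on the second. The function name keep_unique_lines is hypothetical, chosen for illustration.

```shell
# Hypothetical helper: print only the lines of a file that occur exactly once,
# keeping them in their original order. The file is passed to awk twice.
keep_unique_lines() {
    # First pass (NR == FNR): count occurrences of each line.
    # Second pass: print a line only if its count is exactly 1.
    awk 'NR == FNR { count[$0]++; next } count[$0] == 1' "$1" "$1"
}

# Demo with non-adjacent duplicates.
demo_file=$(mktemp)
printf 'user\nezra\njohn\nezra\n' > "$demo_file"
keep_unique_lines "$demo_file"   # prints "user" then "john"
rm -f "$demo_file"
```

Unlike sort test.file | uniq -u, this needs a regular file (it cannot read from a pipe, since the input is read twice) and it holds one counter per distinct line in memory.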
Answer 1 (score: 1)
If you have access to PCRE-style regular expressions, this is straightforward:
s/(?:^|(?<=\n))(.*)\n(?:\1(?:\n|$))+//g
(?:^|(?<=\n)) # Behind us is beginning of string or newline
(.*)\n # Capture group 1: all characters up until next newline
(?: # Start non-capture group
\1 # backreference to what was captured in group 1
(?:\n|$) # a newline or end of string
)+ # End non-capture group, do this 1 or more times
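Here the whole text is treated as a single string. Like uniq, the pattern only catches duplicate lines that are adjacent: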
use strict; use warnings;
my $str =
'hello
this is
this is
this is
that is';
$str =~ s/
(?:^|(?<=\n))
(.*)\n
(?:
\1
(?:\n|$)
)+
//xg;
print "'$str'\n";
__END__
Output:
'hello
that is'