For example, consider a file sentences.txt:
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence X
This is sentence Y
This is sentence Y
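We see that the first "This is sentence X" is followed by "This is sentence Y".
Is there a command to check whether two consecutive lines are identical, such as "This is sentence X" followed by "This is sentence X", or "This is sentence Y" followed by "This is sentence Y"? In lines 11 and 12, we see the same line repeated twice.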
Answer 0 (score: 4)
You don't even need awk! You can simply use the uniq command.
$ cat sentences.txt
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence X
This is sentence Y
This is sentence Y
$ uniq -d sentences.txt
This is sentence X
This is sentence Y
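Explanation:
uniq is a very handy command that can print consecutive duplicate lines in a file, count them, and so on. Here I use the -d option, which prints one copy of each line that is repeated consecutively.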
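To make the explanation concrete, here is a minimal sketch that runs uniq on a six-line sample and turns the result into a yes/no answer. The file name sample.txt, the temporary directory, and the variables dups and has_dups are assumptions made for illustration; they are not part of the original answer.

```shell
# Build a small sample file (hypothetical name: sample.txt) in a temp directory.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
sample="$tmpdir/sample.txt"
printf '%s\n' 'This is sentence X' 'This is sentence Y' \
              'This is sentence X' 'This is sentence X' \
              'This is sentence Y' 'This is sentence Y' > "$sample"

# -d: one copy of each line that is immediately repeated.
dups=$(uniq -d "$sample")
printf '%s\n' "$dups"

# -c: prefix every run of identical lines with the length of that run.
uniq -c "$sample"

# Yes/no check: uniq -d prints nothing when no two adjacent lines match.
if [ -n "$dups" ]; then
    has_dups=yes
else
    has_dups=no
fi
echo "consecutive duplicates: $has_dups"
```

Testing whether the output of uniq -d is empty is probably the simplest way to use it as a check in a script, since the command itself only prints lines.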
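Bonus:
If you want to know on which lines the duplicates occur, you can number the lines with cat -n first (the listing below comes from a shorter, 11-line variant of the file):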
$ cat -n sentences.txt
1 This is sentence Y
2 This is sentence X
3 This is sentence Y
4 This is sentence X
5 This is sentence Y
6 This is sentence X
7 This is sentence Y
8 This is sentence X
9 This is sentence X
10 This is sentence Y
11 This is sentence Y
$ cat -n sentences.txt | uniq -f1 -d
8 This is sentence X
10 This is sentence Y
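where -f1 tells uniq to skip the first field (the line number) when comparing lines.
Last but not least, if you want to print every line of each duplicated run, use the -D option (available in GNU uniq).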
$ cat -n sentences.txt | uniq -f1 -D
8 This is sentence X
9 This is sentence X
10 This is sentence Y
11 This is sentence Y
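As a sketch, the same two commands can be run against the 14-line file from the question. The temporary path and the variables first_of_run and all_of_run are assumptions for illustration, and the awk and paste stages are only there to reduce the output to line numbers. Note that -d reports the first line of each repeated run.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
sentences="$tmpdir/sentences.txt"

# Recreate the 14-line file from the question: five X/Y pairs, then X, X, Y, Y.
{
    for i in 1 2 3 4 5; do
        printf '%s\n' 'This is sentence X' 'This is sentence Y'
    done
    printf '%s\n' 'This is sentence X' 'This is sentence X' \
                  'This is sentence Y' 'This is sentence Y'
} > "$sentences"

# -d keeps the first line of each repeated run; awk extracts its line number.
first_of_run=$(cat -n "$sentences" | uniq -f1 -d | awk '{print $1}' | paste -sd ' ' -)

# -D (GNU uniq) keeps every line of each repeated run.
all_of_run=$(cat -n "$sentences" | uniq -f1 -D | awk '{print $1}' | paste -sd ' ' -)

echo "first line of each run: $first_of_run"
echo "all lines in runs:      $all_of_run"
```

On this input the first command yields lines 11 and 13, and the second yields lines 11 through 14.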
Answer 1 (score: 1)
awk to the rescue!
$ awk 'p==$0{print NR, $0} {p=$0}' file
will print the repeated lines along with their line numbers:
12 This is sentence X
14 This is sentence Y
If you don't need the line numbers,
$ awk 'p==$0; {p=$0}' file
is enough.
Another option, which makes the duplicates stand out:
$ awk 'p==$0{printf "%s", "==DUP==> "} 1; {p=$0}' file
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
This is sentence Y
This is sentence X
==DUP==> This is sentence X
This is sentence Y
==DUP==> This is sentence Y
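All of the awk one-liners above rely on the same idea: remember the previous line and compare it with the current one. The following is a commented sketch of that idea, not the answer's exact code. The variable name prev, the NR > 1 guard, the "N: line" output format, and the sample file are assumptions added for illustration.

```shell
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
sample="$tmpdir/sample.txt"
printf '%s\n' 'This is sentence X' 'This is sentence Y' \
              'This is sentence X' 'This is sentence X' \
              'This is sentence Y' 'This is sentence Y' > "$sample"

dup_report=$(awk '
    # prev holds the previous line; NR > 1 avoids a false match when
    # the very first line is empty (prev starts out as an empty string).
    NR > 1 && $0 == prev { print NR ": " $0 }
    # Remember the current line for the next comparison.
    { prev = $0 }
' "$sample")
printf '%s\n' "$dup_report"
```

Because the comparison happens on the second line of a pair, this reports lines 4 and 6 for the sample, that is, the repeated copy rather than the first occurrence.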