Comparing the contents of two files in bash

Time: 2014-09-16 16:20:30

Tags: bash grep comparison

I have two files, tmp1.txt and tmp2.txt.

tmp1.txt contains:

aaa.txt
bbb.txt
ccc.txt
ddd.txt
eee.txt
aab.txt

tmp2.txt contains:

aaa.txt
aac.txt
bbb.txt
bbd.txt
ccc.txt
ddd.txt
zzz.txt
yyy.txt

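I want to compare these two files in bash and get the following results:

  1. Files that are in both tmp1.txt and tmp2.txt: aaa.txt, bbb.txt, ccc.txt, ddd.txt
  2. Files that are in tmp1.txt but not in tmp2.txt: eee.txt, aab.txt
  3. Files that are in tmp2.txt but not in tmp1.txt: aac.txt, bbd.txt, zzz.txt, yyy.txt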

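3 answers:

Answer 0 (score: 2)

As the commenters mentioned, the comm command does what you are looking for, with one caveat: the files must be sorted first. Fortunately, that is easy.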

$ sort tmp1.txt > tmp1_sorted.txt
$ sort tmp2.txt > tmp2_sorted.txt

Then:

$ comm tmp1_sorted.txt tmp2_sorted.txt
                aaa.txt
aab.txt
        aac.txt
                bbb.txt
        bbd.txt
                ccc.txt
                ddd.txt
eee.txt
        yyy.txt
        zzz.txt

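According to the man page, "With no options, [comm] produces three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files."

To get a column on its own, use the options -1, -2, and -3, which suppress the first, second, and third column respectively. For example, to get only the first column (lines unique to tmp1.txt), suppress columns 2 and 3: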

$ comm -23 tmp1_sorted.txt tmp2_sorted.txt
aab.txt
eee.txt
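
The same pattern covers all three results from the question: -12 keeps only the common lines, -23 keeps the lines unique to the first file, and -13 keeps the lines unique to the second. Below is a minimal sketch that wraps the sort and comm steps in one function. The name compare_lists and the scratch directory are assumptions made for illustration; the sample data is copied from the question.

```shell
#!/bin/bash
# Sketch: print all three result sets from the question using comm.
# compare_lists is an illustrative helper name, not a standard command.
compare_lists() {
    sorted1=$(mktemp) && sorted2=$(mktemp) || return 1
    sort "$1" > "$sorted1"            # comm requires sorted input
    sort "$2" > "$sorted2"

    echo "In both:"
    comm -12 "$sorted1" "$sorted2"    # suppress columns 1 and 2, keep common lines
    echo "Only in $1:"
    comm -23 "$sorted1" "$sorted2"    # suppress columns 2 and 3, keep column 1
    echo "Only in $2:"
    comm -13 "$sorted1" "$sorted2"    # suppress columns 1 and 3, keep column 2

    rm -f "$sorted1" "$sorted2"
}

# Recreate the sample files from the question in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf '%s\n' aaa.txt bbb.txt ccc.txt ddd.txt eee.txt aab.txt > tmp1.txt
printf '%s\n' aaa.txt aac.txt bbb.txt bbd.txt ccc.txt ddd.txt zzz.txt yyy.txt > tmp2.txt

compare_lists tmp1.txt tmp2.txt
```

Because both files go through the same sort in the same locale, the ordering that comm expects is consistent between them.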

Answer 1 (score: 0)

You can roll your own solution with awk:

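awk '
# First file (tmp1.txt): remember every line.
NR==FNR { a[$0]++; next }
# Second file (tmp2.txt): a line already stored from tmp1.txt is in both.
{
    print ($0 in a ? "In both: " $0: "In tmp2.txt: " $0);
    delete a[$0]
}
# Whatever is still stored appeared only in tmp1.txt.
END {
    for(left in a) print "In tmp1.txt: " left
}
' tmp1.txt tmp2.txt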
In both: aaa.txt
In tmp2.txt: aac.txt
In both: bbb.txt
In tmp2.txt: bbd.txt
In both: ccc.txt
In both: ddd.txt
In tmp2.txt: zzz.txt
In tmp2.txt: yyy.txt
In tmp1.txt: eee.txt
In tmp1.txt: aab.txt

You may want to pipe the output to sort -k2 so the results are grouped by category:

awk '...' | sort -k2
In both: aaa.txt
In both: bbb.txt
In both: ccc.txt
In both: ddd.txt
In tmp1.txt: aab.txt
In tmp1.txt: eee.txt
In tmp2.txt: aac.txt
In tmp2.txt: bbd.txt
In tmp2.txt: yyy.txt
In tmp2.txt: zzz.txt
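
One caveat with the awk approach above: because each matched entry is deleted from the array, a line that appears twice in the second file is reported as "In both" the first time and as "In tmp2.txt" the second time. The following is a hedged sketch of a variant that reports each distinct line once. The function name compare_awk, the file names first.txt and second.txt, and the output labels are assumptions for illustration. Like the original, it relies on NR==FNR, so it assumes the first file is not empty.

```shell
#!/bin/bash
# Sketch: awk comparison that reports each distinct line once, even when the
# second file repeats a line. compare_awk is an illustrative name.
compare_awk() {
    awk '
    NR == FNR { in_first[$0] = 1; next }   # first file: remember each line
    !($0 in seen) {                        # second file: handle each distinct line once
        seen[$0] = 1
        label = ($0 in in_first) ? "In both: " : "In second: "
        print label $0
    }
    END {
        # Lines stored from the first file that never showed up in the second.
        for (line in in_first)
            if (!(line in seen)) print "In first: " line
    }
    ' "$1" "$2" | LC_ALL=C sort -k2
}

# Small sample with a deliberately repeated line in the second file.
cd "$(mktemp -d)" || exit 1
printf '%s\n' aaa.txt bbb.txt eee.txt > first.txt
printf '%s\n' aaa.txt aaa.txt zzz.txt > second.txt

compare_awk first.txt second.txt
```

The trailing sort is there because awk's for-in loop visits array entries in an unspecified order.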

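Answer 2 (score: -1)

#!/bin/bash
# cmp only reports whether the two files are byte-for-byte identical.
old=file1
new=file2
# -s is the POSIX spelling of --silent: no output, only an exit status.
cmp -s "$old" "$new" && echo "Files are identical" || echo "Files are different"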
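
Note that cmp compares bytes, so two list files holding the same names in a different order count as different. As a rough sketch of the distinction, the helper below (same_lines, a hypothetical name) sorts both files before running cmp, so it ignores order and repeated lines. The file names old.txt and new.txt are made up for the demonstration.

```shell
#!/bin/bash
# Sketch: cmp compares bytes, so line order matters. same_lines is an
# illustrative helper that ignores order and repeated lines.
same_lines() {
    s1=$(mktemp) && s2=$(mktemp) || return 2
    sort -u "$1" > "$s1"
    sort -u "$2" > "$s2"
    cmp -s "$s1" "$s2"
    status=$?
    rm -f "$s1" "$s2"
    return "$status"
}

# Two files with the same lines in a different order.
cd "$(mktemp -d)" || exit 1
printf '%s\n' aaa.txt bbb.txt > old.txt
printf '%s\n' bbb.txt aaa.txt > new.txt

if cmp -s old.txt new.txt; then echo "cmp: identical"; else echo "cmp: different"; fi
if same_lines old.txt new.txt; then echo "same_lines: same set"; else echo "same_lines: different set"; fi
```

Neither check says which lines differ; for that, a line-level tool such as comm is still needed.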