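Comparing multiple files in the same directory on Unix

Date: 2016-05-13 08:49:34

Tags: linux bash shell unix

I have multiple files in a directory, and I would like to know whether there is a tool that can compare all of them and output the differences. Alternatively, could someone help me write a script to do this? Edit: I have five files containing some values. I need to find the values that are unique to each file and write them to another file.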

Sample1.txt
001,20160512
002,20160512
003,20160512

Sample2.txt
001,20160512
004,20160512
006,20160512

Sample3.txt
004,20160512
008,20160512
007,20160512

Sample4.txt
008,20160512
005,20160512
006,20160512

My output should compare two files, for example Sample1.txt and Sample2.txt, and list the unique values. For example:

Out1.txt
Unique in Sample1.txt 
002,20160512
003,20160512

Out2.txt
Unique in Sample2.txt 
004,20160512
006,20160512

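Likewise, compare Sample2.txt with Sample3.txt and write the values to another output file, then compare Sample3 and Sample4, Sample1 and Sample3, Sample1 and Sample4, and Sample2 and Sample4, producing each result in a separate file with a header.

I don't want to use vimdiff, because there may be more than four files.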

2 answers:

Answer 0 (score: 1)

You can use the following as a hint:

diff --suppress-common-lines Sample1.txt Sample2.txt | awk 'BEGIN {print "Unique in Sample1.txt"} /^</ {print $2}'
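
Because diff compares lines by position, a line that only moved (Sample3.txt lists 008 before 007) can show up as a difference. As a rough sketch of an order-independent variant, the helper below uses grep instead of diff. The name unique_in_first is hypothetical and not part of the answer above, and the sketch assumes every value sits on its own line.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): print a header, then every line of
# file $1 that does not occur as a whole line anywhere in file $2.
unique_in_first() {
    echo "Unique in $1"
    # -F: fixed strings, -x: match whole lines only, -v: invert the match,
    # -f: read the patterns (one per line) from file $2.
    # grep exits with status 1 when it selects no lines; that is not an error here.
    grep -Fxv -f "$2" "$1" || [ "$?" -eq 1 ]
}

# Example use with the question's file names:
#   unique_in_first Sample1.txt Sample2.txt > Out1.txt
#   unique_in_first Sample2.txt Sample1.txt > Out2.txt
```

The lines are printed in the order they appear in the first file, so no sorting is needed.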

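Answer 1 (score: 1)

I tried using a bash array to store the file list, and join inside a loop, to get the unique lines for every possible pair of files.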

#!/bin/bash

# List of files, can be modified as needed, can be any number of files
# The logic will work even if the files have a .txt extension, but 
# the final output file names will look odd

filelist=(file1 file2 file3 file4) 

# 'for' loop logic added to get the unique entries in each of the following combinations and in each of the files

# file1 file2
# file1 file3
# file1 file4
# file2 file3
# file2 file4
# file3 file4

# Outer loop: first file of each pair
for (( i=0; i<${#filelist[@]}; i+=1 )); do
    # Inner loop: second file of each pair (always later in the list than 'file i')
    for (( j=i+1; j<${#filelist[@]}; j+=1 )); do

        # One output file per pair, e.g. uniquefile1file2.txt
        out="unique${filelist[i]}${filelist[j]}.txt"

        echo "Unique between ${filelist[i]} ${filelist[j]}" > "$out"

        echo "Unique in ${filelist[i]}" >> "$out"

        # Will produce unique lines in 'file i' when comparing 'file i' and 'file j'
        # Note: join matches on the first whitespace-separated field, so this
        # assumes the lines contain no spaces
        join -v 1 <(sort "${filelist[i]}") <(sort "${filelist[j]}") >> "$out"

        echo "Unique in  ${filelist[j]}" >> "$out"

        # Will produce unique lines in 'file j' when comparing 'file i' and 'file j'
        join -v 2 <(sort "${filelist[i]}") <(sort "${filelist[j]}") >> "$out"

    done
done

This produces the following output files:

$ ls unique*
uniquefile1file2.txt  uniquefile1file3.txt  uniquefile1file4.txt  uniquefile2file3.txt  uniquefile2file4.txt  uniquefile3file4.txt

The contents of each file look like this:

$ cat uniquefile1file2.txt
Unique between file1 file2
Unique in file1
002,20160512
003,20160512
Unique in  file2
004,20160512
006,20160512
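
Since the question mentions that there may be more than four files, the list can also be passed in from a glob instead of being hard-coded. The following is only a sketch under a few assumptions: the function name compare_all_pairs and the unique_<a>_<b>.txt naming are made up for illustration, all files live in the same directory, and comm on pre-sorted copies is swapped in for join so that each file is sorted once and whole lines are compared.

```shell
#!/bin/sh
# Sketch: compare every pair of the given files, sorting each file only once.
# compare_all_pairs and the output file names are illustrative assumptions.
compare_all_pairs() {
    sorted_dir=$(mktemp -d) || return 1

    # Keep a sorted copy of each input so comm can reuse it for every pair.
    for f in "$@"; do
        sort "$f" > "$sorted_dir/$(basename "$f")"
    done

    for a in "$@"; do
        # The word list of the outer loop was expanded before the loop started,
        # so shifting here only affects the inner loop: it sees the files after $a.
        shift
        for b in "$@"; do
            # Strip the .txt suffix so names do not become e.g. Sample1.txtSample2.txt
            out="unique_$(basename "$a" .txt)_$(basename "$b" .txt).txt"
            {
                echo "Unique in $a"
                # -23 hides lines only in the second file and lines common to both
                comm -23 "$sorted_dir/$(basename "$a")" "$sorted_dir/$(basename "$b")"
                echo "Unique in $b"
                # -13 hides lines only in the first file and lines common to both
                comm -13 "$sorted_dir/$(basename "$a")" "$sorted_dir/$(basename "$b")"
            } > "$out"
        done
    done

    rm -rf "$sorted_dir"
}

# Example use with the question's file names:
#   compare_all_pairs Sample*.txt
```

Unlike the join version, the unique lines come out in sorted order rather than in their original order within each file.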