Question

Suppose I have two text files:

file1

    hello i am John
    and i live in Cairo

file2

    hello i am Jogn 
    and i love in Cairo

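I need to list the words that differ between the two texts (not whitespace or anything else) and write the result to file3, which would contain the differing words in two columns, like this: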

    file1     file2
    John      Jogn
    live      love

How can I do this?

I tried

    diff file1 file2

but it did not give the desired result.

Thanks

Answer 1

Use the wdiff command.

If you don't have it, install the "wdiff" package, which should be available in your system's repositories.

    $ wdiff file1 file2
    hello i am [-John-] {+Jogn+}
    and i [-live-] {+love+} in Cairo

If you want a graphical display, the meld program does a good job (install the "meld" package if you don't already have it).

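If you need a specific output format, you will have to write a script. A good start might be to filter each input file so that each word is on its own line (fmt -w 1 is a first approximation), then diff the results.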
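The one-word-per-line idea can be sketched roughly as follows. This is only an illustration, not a finished tool: worddiff is a made-up function name, tr is used in place of fmt to split on whitespace, and the awk step simply pairs removed and added words in the order diff reports them, so the columns only line up when each change replaces one word with one word.

```shell
#!/bin/sh
# worddiff is a hypothetical helper name, not an existing command.
worddiff() {
    w1=$(mktemp) || return 2
    w2=$(mktemp) || return 2
    # One word per line: turn every run of whitespace into a single newline.
    tr -s '[:space:]' '\n' < "$1" > "$w1"
    tr -s '[:space:]' '\n' < "$2" > "$w2"
    # In plain diff output, "< word" comes from the first file and
    # "> word" from the second; pair them up in order of appearance.
    diff "$w1" "$w2" | awk '
        /^< / { del[++n] = substr($0, 3) }
        /^> / { add[++m] = substr($0, 3) }
        END {
            max = (n > m) ? n : m
            for (i = 1; i <= max; i++) printf "%-10s%s\n", del[i], add[i]
        }'
    rm -f "$w1" "$w2"
}
```

Calling worddiff file1 file2 > file3 on the two files from the question should leave the pairs "John Jogn" and "live love" in file3, one pair per line, padded into two columns.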

Answer 2

Using awk:

    awk '
        # BEGIN: print the 1st and 2nd arguments (the file names) as a header
        BEGIN { print ARGV[1], ARGV[2] }
        # if the current line is from "file1",
        # store it in the array "a", keyed by line number
        FNR == NR { a[NR] = $0 }
        # if the current line is from "file2"
        FNR != NR {
            # split the matching line of "file1" into the array "arr"
            split(a[FNR], arr)
            # iterate over the words of the current line
            for (i = 1; i <= NF; i++) {
                # print the pair when the Nth words of file1 and file2 differ
                if (arr[i] != $i) {
                    print arr[i], $i
                }
            }
        }
    ' file1 file2

Output (the header line shows the file names passed on the command line, here /tmp/l1 and /tmp/l2):

    /tmp/l1 /tmp/l2
    John Jogn
    live love
