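Why does the diff utility show similar text in the result file?

Date: 2016-04-19 07:15:59

Tags: bash unix diff

I am using diff to find the differences between two text files. It works well, but when I change the order of the lines in the text files, it shows identical text in the result file.

Here is file1.txt: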

>gi17
AAAAAA
>gi30
BBBBBB
>gi40
CCCCCC
>gi92
DDDDDD
>gi50
EEEEEE
>gi81
FFFFFF

file2.txt:

>gi40
CCCCCC
>gi01
BBBBBB
>gi02
AAAAAA
>gi30
BBBBBB

Result.txt:

>gi17
AAAAAA
>gi30        ???
BBBBBB       ???
>gi92
DDDDDD
>gi01
BBBBBB
>gi50
EEEEEE
>gi81
FFFFFF
>gi02
AAAAAA
>gi30        ???
BBBBBB       ???

The diff command:

$ diff C:/Users/User/Desktop/File1.txt C:/Users/User/Desktop/File2.txt > C:/Users/User/Desktop/Result.txt

Why does it show

>gi30
BBBBBB

as a difference?

Edit 1: What I want is to search the whole of file 2 for each line of file 1, because the two files are not sorted and I cannot modify them (genetic data).

Edit 2: I want to run the join command from my PHP code. It runs successfully in the Cygwin terminal, but it does not run from my PHP code:
shell_exec("C:\\cygwin64\\bin\\bash.exe --login -c 'join -v 1 <(sort $OldDatabaseFile.txt) <(sort $NewDatabaseFile.txt) > $text_files_path/DelSeqGi.txt 2>&1'");

Thanks.

2 answers:

Answer 0 (score: 0)

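As fedorqui said in the comments, diff compares the files line by line, in order. A line that sits at a different position in the second file can therefore be reported as a difference even though the same text exists in both files.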
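A small sketch of this behavior, using two hypothetical three-line files (a.txt and b.txt are made up for the demonstration and are not the asker's data). Both files contain the line C, but at different positions, so diff is expected to report it once as added and once as deleted:

```shell
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' A B C > "$tmp/a.txt"
printf '%s\n' C A B > "$tmp/b.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going if it is run under "set -e".
diff_out=$(diff "$tmp/a.txt" "$tmp/b.txt" || true)
printf '%s\n' "$diff_out"

rm -rf "$tmp"
```

Lines prefixed with `<` come from the first file and lines prefixed with `>` come from the second, so the same text can show up under both prefixes when it has only moved.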

To achieve your goal, you can use:

comm -3 <(sort f1.txt) <(sort f2.txt) > result.txt

From the manual (relevant parts):

comm - compare two sorted files line by line

       -1     suppress column 1 (lines unique to FILE1)

       -2     suppress column 2 (lines unique to FILE2)

       -3     suppress column 3 (lines that appear in both files)


EXAMPLES
  comm -3 file1 file2
    Print lines in file1 not in file2, and vice versa.
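As a rough illustration of the comm approach, the sketch below recreates the question's two sample files in a temporary directory and lists the lines unique to each. The temporary directory, the `.sorted` file names and the variable names are assumptions made for the demo. It writes sorted copies instead of using `<(...)`, so the originals stay untouched and the script does not depend on bash process substitution. `LC_ALL=C` is set so that sort and comm agree on the collation order.

```shell
export LC_ALL=C   # same byte-wise ordering for sort and comm

tmp=$(mktemp -d)  # scratch directory, an assumption for this demo
printf '%s\n' '>gi17' AAAAAA '>gi30' BBBBBB '>gi40' CCCCCC \
              '>gi92' DDDDDD '>gi50' EEEEEE '>gi81' FFFFFF > "$tmp/file1.txt"
printf '%s\n' '>gi40' CCCCCC '>gi01' BBBBBB '>gi02' AAAAAA \
              '>gi30' BBBBBB > "$tmp/file2.txt"

# comm needs sorted input; sort into copies so the originals are not modified.
sort "$tmp/file1.txt" > "$tmp/file1.sorted"
sort "$tmp/file2.txt" > "$tmp/file2.sorted"

# -23 keeps only column 1 (lines unique to file1),
# -13 keeps only column 2 (lines unique to file2).
only_in_file1=$(comm -23 "$tmp/file1.sorted" "$tmp/file2.sorted")
only_in_file2=$(comm -13 "$tmp/file1.sorted" "$tmp/file2.sorted")

printf 'Only in file1:\n%s\n' "$only_in_file1"
printf 'Only in file2:\n%s\n' "$only_in_file2"

rm -rf "$tmp"
```

With this sample data, `>gi30` no longer appears in either list. `BBBBBB` still shows up once under file2, because file2 contains that line twice and file1 only once, and comm pairs duplicate lines one to one.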

Answer 1 (score: 0)

To get the differences between the files, use the join utility, as follows:

DESCRIPTION
     The join utility performs an ``equality join'' on the specified files and
     writes the result to the standard output.  The ``join field'' is the
     field in each file by which the files are compared.  The first field in
     each line is used by default.  There is one line in the output for each
     pair of lines in file1 and file2 which have identical join fields.  Each
     output line consists of the join field, the remaining fields from file1
     and then the remaining fields from file2.

 -v file_number
         Do not display the default output, but display a line for each
         unpairable line in file file_number.  The options -v 1 and -v 2
         may be specified at the same time.

 -1 field
         Join on the field'th field of file1.

 -2 field
         Join on the field'th field of file2.

join -v 1 <(sort file1.txt) <(sort file2.txt)     # lines in file1.txt that file2.txt does not have
join -v 2 <(sort file1.txt) <(sort file2.txt)     # lines in file2.txt that file1.txt does not have
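One point from the manual excerpt worth checking against your data: join pairs lines on the first field only, not on the whole line. For single-field lines like the question's samples this makes no difference, but for lines with several fields it can. A minimal sketch, assuming two hypothetical two-line inputs (old.sorted and new.sorted are made-up names), compares join -v 1 with comm -23 on the same data:

```shell
export LC_ALL=C   # same byte-wise ordering for sort, join and comm

tmp=$(mktemp -d)  # scratch directory, an assumption for this demo
printf '%s\n' 'gi17 AAAAAA' 'gi30 BBBBBB' | sort > "$tmp/old.sorted"
printf '%s\n' 'gi30 CCCCCC' 'gi40 DDDDDD' | sort > "$tmp/new.sorted"

# join matches on the first field (gi30 exists in both inputs) ...
join_only_old=$(join -v 1 "$tmp/old.sorted" "$tmp/new.sorted")
# ... while comm compares whole lines ("gi30 BBBBBB" vs "gi30 CCCCCC").
comm_only_old=$(comm -23 "$tmp/old.sorted" "$tmp/new.sorted")

printf 'join -v 1:\n%s\n' "$join_only_old"
printf 'comm -23:\n%s\n' "$comm_only_old"

rm -rf "$tmp"
```

Here join reports only `gi17 AAAAAA`, while comm also reports `gi30 BBBBBB`, since that full line does not occur in the second input. Which behavior you want depends on whether a record is identified by its first field or by the entire line.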

Original answer / credit: https://stackoverflow.com/a/4544780/5291015