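I have two files, and I use "comm -23 file1 file2" to extract the lines that differ between them into another file.
I now need to extract the differing lines in the same way, but also keep the "line_$NR" prefix. Example: file1: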
line_1: This is line0
line_2: This is line1
line_3: This is line2
line_4: This is line3
file2:
line_1: This is line1
line_2: This is line2
line_3: This is line3
I need this output (difference of file1 and file2):
line_1: This is line0.
In short, I need to compute the difference as if the lines did not start with line_$NR, but when I print the result I also need to print line_$NR.
Answer 0 (score: 0)
Try using awk:
awk -F: 'NR==FNR {a[$2]; next} !($2 in a)' file2 file1
Output:
line_1: This is line0
Brief explanation
awk -F: '      # -F: sets the field separator to ":"; $1 holds line_<n> and $2 holds " This is line<m>"
NR==FNR {      # NR equals FNR only while the first file (file2) is being read
    a[$2];     # store $2 as a key in the associative array a
    next       # skip the remaining rules and go to the next record
}
!($2 in a)     # for file1: print the line if its $2 is not a key of array a
' file2 file1
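To try the command end to end, the sample data from the question can be recreated in a scratch directory. This is only a sketch: the tmpdir and result variables are introduced here for illustration and are not part of the answer.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'line_1: This is line0' 'line_2: This is line1' \
              'line_3: This is line2' 'line_4: This is line3' > "$tmpdir/file1"
printf '%s\n' 'line_1: This is line1' 'line_2: This is line2' \
              'line_3: This is line3' > "$tmpdir/file2"

# file2 is read first, so its contents ($2) become the lookup keys.
# Lines of file1 whose $2 is not among those keys are printed whole,
# prefix included.
result=$(awk -F: 'NR==FNR {a[$2]; next} !($2 in a)' "$tmpdir/file2" "$tmpdir/file1")
echo "$result"

rm -rf "$tmpdir"
```

Note the argument order: the file whose contents should be filtered out (file2) comes first, and the file whose lines are reported (file1) comes second.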
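Additional requirement (from the comments)
If a line contains more than one ":", for example "line_1:This:is:line0", this does not work. How can I use only line_x as the separator?
In that case, try the following (GNU awk only):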
awk -F'line_[0-9]+:' 'NR==FNR {a[$2]; next} !($2 in a)' file2 file1
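If relying on a regular-expression field separator is not desirable, one alternative (not from the answer above) is to leave the field separator alone and compare everything after the first colon, using the standard awk functions index() and substr(). A minimal sketch, with sample data adapted from the comment; the names key, seen, tmpdir and result are illustrative only:

```shell
# Sample data with several colons per line (adapted from the comment).
tmpdir=$(mktemp -d)
printf '%s\n' 'line_1:This:is:line0' 'line_2:This:is:line1' > "$tmpdir/file1"
printf '%s\n' 'line_1:This:is:line1' > "$tmpdir/file2"

result=$(awk '
  { key = substr($0, index($0, ":") + 1) }  # everything after the first colon
  NR == FNR { seen[key]; next }             # first file (file2): remember its contents
  !(key in seen)                            # second file (file1): print lines not seen
' "$tmpdir/file2" "$tmpdir/file1")
echo "$result"

rm -rf "$tmpdir"
```

Because the key is cut at the first colon only, any further colons in the content are compared as ordinary text.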
Answer 1 (score: 0)
This awk line is longer, but it works no matter where the differences are located:
awk 'NR==FNR{a[$NF]=$0;next}a[$NF]{a[$NF]=0;next}7;END{for(x in a)if(a[x])print a[x]}' file1 file2
Test:
kent$ head f*
==> f1 <==
line_1: This is line0
line_2: This is line1
line_3: This is line2
line_4: This is line3
==> f2 <==
line_1: This is line1
line_2: This is line2
line_3: This is line3
#test f1 f2
kent$ awk 'NR==FNR{a[$NF]=$0;next}a[$NF]{a[$NF]=0;next}7;END{for(x in a)if(a[x])print a[x]}' f1 f2
line_1: This is line0
#test f2 f1:
kent$ awk 'NR==FNR{a[$NF]=$0;next}a[$NF]{a[$NF]=0;next}7;END{for(x in a)if(a[x])print a[x]}' f2 f1
line_1: This is line0
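The one-liner in this answer is dense, so here is the same program spread over several lines with comments, as I read it. The scratch directory and the result variable are added for illustration. Note that it keys on $NF, the last whitespace-separated field, so two lines are treated as equal whenever their last word matches.

```shell
# Recreate f1 and f2 from the test above in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'line_1: This is line0' 'line_2: This is line1' \
              'line_3: This is line2' 'line_4: This is line3' > "$tmpdir/f1"
printf '%s\n' 'line_1: This is line1' 'line_2: This is line2' \
              'line_3: This is line3' > "$tmpdir/f2"

result=$(awk '
  # First file: remember each whole line under its last field.
  NR == FNR { a[$NF] = $0; next }
  # Second file, key also present in the first file: mark as matched and skip.
  a[$NF]    { a[$NF] = 0; next }
  # Second file, key not in the first file: a non-zero pattern prints the line.
  7
  # Finally print the first-file lines that were never matched.
  END       { for (x in a) if (a[x]) print a[x] }
' "$tmpdir/f1" "$tmpdir/f2")
echo "$result"

rm -rf "$tmpdir"
```

Since unmatched lines from both files are printed, the result is a symmetric difference, which is why swapping the file arguments gives the same output in the test above. The order of lines printed by the END block follows awk's array traversal order, which is unspecified.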