Question

我有两个输入文件（制表符分隔），我需要找到它们之间的$ 1＆amp;＆amp;如果仅匹配第3和第4个字段将向下移动，则$ 2：

INPUT：文件1：

 p1   555  
 p1   557  
 p3   558

文件2：

p1  323 lololo  aaaa    
p1  555 papapp  kkka    
p1  556 hooho   sssa    
p1  557 jjjlo   kkka    
p3  424 zzzzz   llla    
p3  558 jjjjj   ssss

输出：

p1 323  lololo aaaa
p1 555
p1 556  papaapp kkka
p1 557   
p3 424  hooho   sssa
p3 558      
        jjjlo   kkka

等。

谢谢

Answer 1

这些方面应该有效：

awk 'NR == FNR { to_shift[$1,$2] = 1; next } { queue[++w] = $3 OFS $4 } to_shift[$1, $2] { print $1, $2; next } { print $1, $2, queue[++r] } END { while(r != w) { print OFS OFS queue[++r] } }' file1 file2

那是：

NR == FNR {                      # while processing the first file (file1)
  to_shift[$1,$2] = 1            # remember which lines to shift
  next                           # and do nothing else
}
{                                # afterwards (processing file2):
  queue[++w] = $3 OFS $4         # queue the next payload fields
}
to_shift[$1, $2] {               # If this is a shift line
  print $1, $2                   # print only the first two fields
  next                           # and do nothing else
}
{                                # otherwise, print the first two fields and
  print $1, $2, queue[++r]       # the next queued payload
}
END {                            # In the end:
  while(r != w) {                # print out what remains in the queue, i.e.
    print OFS OFS queue[++r]     # all that was shifted out at the bottom
  }
}

我怀疑对于格式化，您可能希望使用\t作为输出字段分隔符，在这种情况下，您只需将-v OFS='\t'传递给awk：

awk -v OFS='\t' 'NR == FNR { to_shift[$1,$2] = 1; next } { queue[++w] = $3 OFS $4 } to_shift[$1, $2] { print $1, $2; next } { print $1, $2, queue[++r] } END { while(r != w) { print OFS OFS queue[++r] } }' file1 file2

如果输入是以制表符分隔的，并且字段可以包含空格，那么也可以传递-F '\t'以使输入字段分隔符成为选项卡。

比较两个文件，如果匹配向下移动最后一个字段（awk）

1 个答案: