Compare multiple columns and replace only on a match

Date: 2017-11-05 22:47:03

Tags: awk

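  • I have two files (FILE1 and FILE2).
  • I am trying to compare the strings in columns 1 and 2 of FILE1 with columns 4 and 5 of FILE2. In addition to that match, column 6 of FILE2 must match one of the strings SO or CO (because columns 3 and 4 of FILE1 are named SO and CO respectively). When both conditions hold, column 7 of FILE2 should be replaced with the value from the corresponding column of FILE1 (column 3 for SO, column 4 for CO); otherwise the line should be left unchanged.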

  • I tried adapting solutions posted on the forum for similar problems, but none of them worked.

    FILE1
    type  code     SO  CO other
    
    7757    1       6941.958        138.922 149.17
    7757    2       8666.123        198.908 225.67
    7757    4       2795.885        334.875 378.68
    7759    GT3     222.104    13.5    734.62
    7768    CT2     0       0       0
    7805    6       3796.677        75.175  79.09 
    
    FILE2
    "US","01073",,"7757","1","SO","10","299"
    "US","01073",,"7758","1","SO","10","299"
    "US","01073",,"7757","1","NO","10","299"
    "US","01073",,"7757","1","CO","10","299"
    "US","01073",,"7757","4","MO","10","299"
    "US","01073",,"7757","1","GO","10","299"
    "US","01073",,"7805","6","CO","10","299"
    
    Required output:
    "US","01073",,"7757","1","SO","6941.958","299"
    "US","01073",,"7758","1","SO","10","299"
    "US","01073",,"7757","1","NO","10","299"
    "US","01073",,"7757","1","CO","138.922","299"
    "US","01073",,"7757","4","MO","10","299"
    "US","01073",,"7757","1","GO","10","299"
    "US","01073",,"7805","6","CO","75.175","299"
    

    The solution I tried (for CO only):

    tr -d '"' < FILE2 > temp  # to remove double quote
    awk 'NR==FNR{A[$1,$2]=$3;next} A[$4,$5] && $6=="CO" {$7=A[$1,$2]; print}' FS=" " OFS="," FILE1 temp > out
    

1 answer:

Answer 0 (score: 2)

A slightly more involved awk solution:

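awk 'function unquote(f){
         # strip the first and last character (the surrounding double quotes)
         return substr(f, 2, length(f)-2)
     }
     NR==FNR{                         # first file (file1), whitespace-separated
         if (NR==1){ f3=$3; f4=$4 }   # header row: f3="SO", f4="CO"
         else if (NF){ a[$1,$2,f3]=$3; a[$1,$2,f4]=$4 }   # NF skips blank lines
         next;
     }
     # second file (file2), comma-separated: build the key from columns 4, 5 and 6
     { k=unquote($4) SUBSEP unquote($5) SUBSEP unquote($6) }
     k in a{ $7=a[k] }1' file1 FS=',' OFS=',' file2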
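  • function unquote(f) { ... } - strips the surrounding double quotes from a field (strictly speaking, it returns everything between the first and last characters of the string)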

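  • a[$1,$2,f3]=$3; a[$1,$2,f4]=$4 - builds the lookup keys: each combination of type, code and column name (SO or CO) from file1 maps to the value that should be substituted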
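
To see these two pieces in isolation, here is a minimal sketch. The wrapper name demo_lookup and the single hard-coded array entry are assumptions made for illustration only; the value is copied from the first data row of FILE1. It shows that a[x,y,z] is stored under the string x SUBSEP y SUBSEP z, so a key assembled by hand from unquoted fields finds the same entry.

```shell
# Hypothetical helper for illustration only; demo_lookup is not part of the answer.
demo_lookup() {
    printf '%s\n' "$1" | awk -F',' '
    function unquote(f) { return substr(f, 2, length(f) - 2) }
    BEGIN {
        # a["7757","1","SO"] is stored under "7757" SUBSEP "1" SUBSEP "SO"
        a["7757", "1", "SO"] = "6941.958"
    }
    {
        # build the same kind of key from the quoted, comma-separated input
        k = unquote($1) SUBSEP unquote($2) SUBSEP unquote($3)
        print unquote($1), ((k in a) ? a[k] : "no match")
    }'
}

demo_lookup '"7757","1","SO"'   # key exists
demo_lookup '"7757","1","NO"'   # key does not exist
```

The first call prints 7757 6941.958 and the second prints 7757 no match, which is why lines with NO, MO or GO in column 6 pass through the real script untouched.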

Output:

"US","01073",,"7757","1","SO",6941.958,"299"
"US","01073",,"7758","1","SO","10","299"
"US","01073",,"7757","1","NO","10","299"
"US","01073",,"7757","1","CO",138.922,"299"
"US","01073",,"7757","4","MO","10","299"
"US","01073",,"7757","1","GO","10","299"
"US","01073",,"7805","6","CO",75.175,"299"
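
Note that the replaced values in this output are unquoted, whereas the required output in the question keeps them in double quotes. If the quotes matter, one possible tweak is to wrap the looked-up value in quotes when assigning it. This is a sketch and not part of the original answer; the function name replace_quoted is hypothetical, and it assumes the two files have the same layout as FILE1 and FILE2 above.

```shell
# Hypothetical wrapper: $1 is the whitespace-separated lookup file (FILE1 layout),
# $2 is the quoted, comma-separated file to update (FILE2 layout).
replace_quoted() {
    awk '
    function unquote(f) { return substr(f, 2, length(f) - 2) }
    NR == FNR {
        if (NR == 1) { f3 = $3; f4 = $4 }    # header row: names of columns 3 and 4
        else if (NF) { a[$1, $2, f3] = $3; a[$1, $2, f4] = $4 }
        next
    }
    { k = unquote($4) SUBSEP unquote($5) SUBSEP unquote($6) }
    k in a { $7 = "\"" a[k] "\"" }           # put the double quotes back
    1' "$1" FS=',' OFS=',' "$2"
}
```

It would be called as replace_quoted FILE1 FILE2 > out. Lines without a match are printed unchanged, exactly as in the answer above.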