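Match values in two files and replace values in specific columns

Time: 2019-02-20 05:12:52

Tags: awk

The goal is to check whether the values in columns 2 and 3 of file1, joined together, match column 1 of file2. For every match, replace the values in columns 2 and 3 of file2 with the values from columns 4 and 5 of file1.

file1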

100,31431,37131,999991.70,2334362.30
100,31431,37471,111113.20,2334363.30
100,31433,36769,777775.60,2334361.90
102,31433,36853,333322.00,2334362.80

file2

3143137113 318512.50 2334387.50 100
3143137131 318737.50 2334387.50 100
3143137201 319612.50 2334387.50 100
3143137219 319837.50 2334387.50 100
3143137471 322987.50 2334387.50 100
3143137491 323237.50 2334387.50 100
3143336687 313187.50 2334412.50 100
3143336723 313637.50 2334412.50 100
3143336769 314212.50 2334412.50 100
3143336825 314912.50 2334412.50 100
3143336853 315262.50 2334412.50 102

Desired output

31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,2334362.30,100
31431,37201,319612.50,2334387.50,100
31431,37219,319837.50,2334387.50,100
31431,37471,111113.20,2334363.30,100
31431,37491,323237.50,2334387.50,100
31433,36687,313187.50,2334412.50,100
31433,36723,313637.50,2334412.50,100
31433,36769,777775.60,2334361.90,100
31433,36825,314912.50,2334412.50,100
31433,36853,333322.00,2334362.80,102

I tried

awk -F[, ] 'FNR==NR{a[$1 $2]=$0;next}$1 in a{print $0 ,a[$1 $2]}' file1 file2

Thanks in advance

2 answers:

Answer 0: (score: 2)

Could you please try the following. Its output matches the desired output shown in the question.

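awk '
BEGIN{
  OFS=","                      # join output fields with commas
}
FNR==NR{                       # first file (file1): build the lookup tables
  a[$2 $3]=$2 OFS $3           # key is columns 2 and 3 concatenated
  b[$2 $3]=$4;c[$2 $3]=$5      # replacement values from columns 4 and 5
  next
}
($1 in a){                     # file2 line whose key exists in file1
  $2=b[$1]
  $3=c[$1];$1=a[$1]
  print
  next
}
{                              # no match: only split the key after 5 characters
  $1=$1
  sub(/^...../,"&,",$1)
  print
}
' FS="," file1 FS=" " file2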
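The trailing `FS="," file1 FS=" " file2` relies on awk handling `var=value` operands in order, each one taking effect just before the next file is opened, so each file is split with its own separator. A minimal sketch of that behaviour, using two throwaway files (the names `one` and `two` and the temporary directory are only for illustration):

```shell
# Two hypothetical scratch files with identical content.
tmp=$(mktemp -d)
printf 'x,y z,w\n' > "$tmp/one"
printf 'x,y z,w\n' > "$tmp/two"

# FS="," applies while reading one, FS=" " while reading two.
out=$(awk '{ print NF, $1 }' FS="," "$tmp/one" FS=" " "$tmp/two")
printf '%s\n' "$out"

rm -rf "$tmp"
```

The same line is reported as three comma-separated fields for the first file and as two whitespace-separated fields for the second.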

Answer 1: (score: 1)

Try this:

$ awk -F, 'NR==FNR{tmp=$0;sub($1 FS,"",tmp);a[$2 $3]=tmp;next} $1 in a{print a[$1],$NF;next} {$1=substr($1,1,5) OFS substr($1,6,5);} 1' OFS=, file1 FS=' ' file2
31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,2334362.30,100
31431,37201,319612.50,2334387.50,100
31431,37219,319837.50,2334387.50,100
31431,37471,111113.20,2334363.30,100
31431,37491,323237.50,2334387.50,100
31433,36687,313187.50,2334412.50,100
31433,36723,313637.50,2334412.50,100
31433,36769,777775.60,2334361.90,100
31433,36825,314912.50,2334412.50,100
31433,36853,333322.00,2334362.80,102

The above assumes that $1 in file1 does not contain regular-expression characters, so to be accurate and safe it is better to use this:

awk -F, 'NR==FNR{$1="";a[$2 $3]=substr($0,2);next} $1 in a{print a[$1],$NF;next} {$1=substr($1,1,5) OFS substr($1,6,5);} 1' OFS=, file1 FS=' ' file2
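To see why the regular-expression caveat matters: `sub($1 FS, "", tmp)` treats the string `$1 FS` as a dynamic regexp rather than as literal text. The sketch below uses a made-up input line (not taken from file1) whose first field is `1+`, and compares the two approaches:

```shell
# Hypothetical line whose first field contains the regexp character "+".
line='1+,11,22,33,44'

# sub() approach: the pattern "1+," matches "11," instead of the literal first field.
with_sub=$(printf '%s\n' "$line" | awk -F, '{ tmp = $0; sub($1 FS, "", tmp); print tmp }')

# Clearing $1 and skipping the leading OFS involves no regexp at all.
with_substr=$(printf '%s\n' "$line" | awk -F, -v OFS=, '{ $1 = ""; print substr($0, 2) }')

printf '%s\n%s\n' "$with_sub" "$with_substr"
```

Here the `sub()` version removes `11,` and leaves the first field in place, while the version that clears `$1` and takes `substr($0,2)` yields the intended `11,22,33,44`.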

However, this assumes the FS of file1 is only 1 character.

This leads to another change/efficiency improvement:
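One way to avoid both the regexp and the 1-character assumptions, offered only as a hedged sketch and not necessarily the change the answer had in mind, is to build the stored value from the individual fields instead of editing `$0`. The temporary directory and the shortened copies of file1 and file2 below are only for illustration:

```shell
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
100,31431,37131,999991.70,2334362.30
102,31433,36853,333322.00,2334362.80
EOF
cat > "$tmp/file2" <<'EOF'
3143137113 318512.50 2334387.50 100
3143137131 318737.50 2334387.50 100
3143336853 315262.50 2334412.50 102
EOF

out=$(awk '
NR==FNR { a[$2 $3] = $2 OFS $3 OFS $4 OFS $5; next }   # file1: value built field by field
$1 in a { print a[$1], $NF; next }                      # file2: key found, use file1 values
{ $1 = substr($1, 1, 5) OFS substr($1, 6) }             # otherwise only split the key
1
' FS=, OFS=, "$tmp/file1" FS=' ' "$tmp/file2")
printf '%s\n' "$out"

rm -rf "$tmp"
```

Because the value is assembled with `OFS` from `$2` through `$5`, nothing depends on the length of the input separator or on the content of `$1`.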