Using awk to do a vlookup on all 4 columns of a csv, from file2 into File1

Time: 2018-11-17 10:25:23

Tags: unix awk ksh

I am trying to do a vlookup keyed on 4 columns across a large csv data set.

F1: file 1

TSM,TYPE,NODE,SCHED
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5

F2: file 2

AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired result in F1 on the 17th (written back into file 1):

TSM,TYPE,NODE,SCHED,2018-11-17
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA

Desired result after running the code against F2 on the 18th (written back into File1):

TSM,TYPE,NODE,SCHED,2018-11-17,2018-11-18
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES,YES
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5,NA,NA

Code

awk -F, -v date=$(date +'%Y-%m-%d') '
BEGIN   { OFS = FS }
FNR==NR { a[$1] = $5; next }
FNR==1  { n1 = n = NF + 1; $n = date; print; next }
        { $n1 = ($1 in a) ? a[$1] : "NA"; print }
' f2 f1 > t && mv -f t f1

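The code above gives incorrect results.

1 answer:

Answer 0: (score: 1)

The last line of your expected output does not appear to follow the rule you described. Could you please try the following.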

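awk -F, -v DAT="$(date +'%Y-%m-%d')" '
# Header line of Input_file1: append the date as the new column name
FNR!=NR && FNR==1{
  print $0","DAT
  next
}
# Input_file2 (read first): store the status in column 5, keyed on columns 1-4
FNR==NR{
  a[$1,$2,$3,$4]=$5
  next
}
# Data lines of Input_file1: keep the line and append the status, or NA if no match
{
  $0=$0 "," ((($1,$2,$3,$4) in a)?a[$1,$2,$3,$4]:"NA")
}
1
'  Input_file2   Input_file1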

If you want to save the output back into Input_file1 itself, append > temp_file && mv temp_file Input_file1 to the command.
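
To see how the four-column key behaves over two consecutive days, here is a minimal sketch that rebuilds the sample data from the question in a scratch directory. The function name append_status, the scratch file names, and the hard-coded dates are assumptions made for illustration only; they are not part of the original question or answer. Keying on all four columns matters because several rows share the same first column (AIXTSM1), so an array indexed by $1 alone would let later rows overwrite earlier ones.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print MASTER_FILE with one
# extra column taken from column 5 of LOOKUP_FILE, matched on columns 1-4.
# usage: append_status LOOKUP_FILE MASTER_FILE COLUMN_NAME
append_status() {
  awk -F, -v col="$3" '
    FNR==NR { status[$1,$2,$3,$4] = $5; next }  # lookup file: remember status per 4-column key
    FNR==1  { print $0 "," col; next }          # master header: add the new column name
    { print $0 "," ((($1,$2,$3,$4) in status) ? status[$1,$2,$3,$4] : "NA") }
  ' "$1" "$2"
}

# Sample data copied from the question, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/f1" <<'EOF'
TSM,TYPE,NODE,SCHED
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD
AIXTSM1,VHOST,AGGREGATE,DAILY_2200_VM_SDC-CTL-PROD5
EOF
cat > "$tmp/f2" <<'EOF'
AIXTSM1,VHOST,10.199.114.72,DAILY_1800_VM_SDC-CTL-PROD,YES
AIXTSM1,VHOST,ADMET007,DAILY_1800_VM_SDC-CTL-PROD3,NO
AIXTSM2,VHOST,ADMET014,DAILY_1900_VM_UDC-CTL-PROD,YES
AIXTSM1,VHOST,AGGREGATE4,DAILY_2200_VM_SDC-CTL-PROD5,NA
EOF

# Simulate the runs on the 17th and the 18th; each run adds one column to f1.
for day in 2018-11-17 2018-11-18; do
  append_status "$tmp/f2" "$tmp/f1" "$day" > "$tmp/t" && mv -f "$tmp/t" "$tmp/f1"
done
cat "$tmp/f1"
```

In a real daily job the column name would presumably come from "$(date +%Y-%m-%d)" instead of a fixed string. Note that the AGGREGATE row gets NA because F2 only contains AGGREGATE4, so the four-column key finds no match for it.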