我有两个文件。首先包含所有样品的名称,数字和天数 sam_name.csv
Number,Day,Sample
171386,0,38_171386_D0_2-1.raw
171386,0,38_171386_D0_2-2.raw
171386,2,30_171386_D2_1-1.raw
171386,2,30_171386_D2_1-2.raw
171386,-1,40_171386_D-1_1-1.raw
171386,-1,40_171386_D-1_1-2.raw
第二个包含有关批次的信息(最后一栏) sam_batch.csv
Number,Day,Quar,Code,M.F,Status,Batch
171386,0,1,x,F,C,1
171386,1,1,x,F,C,2
171386,2,1,x,F,C,5
171386,-1,1,x,F,C,6
我想获取有关批次的信息(使用两个条件编号和日期)并将其添加到第一个文件中。我使用awk命令来做到这一点,但我只在一次性点(-1)得到结果。
这是我的命令:
awk -F"," 'NR==FNR{number[$1]=$1;day[$1]=$2;batch[$1]=$7; next}{if($1==number[$1] && $2==day[$1]){print $0 "," number[$1] "," day[$1] "," batch[$1]}}' sam_batch.csv sam_nam.csv
以下是我的结果:(文件sam_name,文件sam_batch中的数字和日期(仅用于检查条件是否正常)和批号(我需要的值)
Number,Day,Sample,Number,Day, Batch
171386,-1,40_171386_D-1_1-1.raw,171386,-1,6
171386,-1,40_171386_D-1_1-2.raw,171386,-1,6
175618,-1,08_175618_D-1_1-1.raw,175618,-1,2
答案 0 :(得分:1)
Here I corrected your AWK code:
awk -F"," 'NR==FNR{
number_day = $1 FS $2
batch[number_day]=$7
next
}
{
number_day = $1 FS $2
print $0 "," batch[number_day]
}' sam_batch.csv sam_name.csv
Output:
Number,Day,Sample,Batch
171386,0,38_171386_D0_2-1.raw,1
171386,0,38_171386_D0_2-2.raw,1
171386,2,30_171386_D2_1-1.raw,5
171386,2,30_171386_D2_1-2.raw,5
171386,-1,40_171386_D-1_1-1.raw,6
171386,-1,40_171386_D-1_1-2.raw,6
(No need for double-checking if you understand how the script works.)
Here's another AWK solution (my original answer):
awk -v "b=sam_batch.csv" 'BEGIN {
FS=OFS=","
while(( getline line < b) > 0) {
n = split(line,a)
nd = a[1] FS a[2]
nd2b[nd] = a[n]
}
}
{ print $1,$2,$3,nd2b[$1 FS $2] }' sam_name.csv
Both solutions parse file sam_batch.csv
at the beginning to form a dictionary of (number, day) -> batch. Then they parse sam_name.csv
, printing out the first three fields together with the "Batch" from another file.