I have 3 CSV files:
Base file (values initialized to 0):
steve tignor ash michael jose sam joshua
0 0 0 0 0 0 0
File 1:
tignor michael jose
888 9 -2
File 2:
ash joshua
77 66
The output I need:
steve tignor ash michael jose sam joshua
File1 0 888 0 9 -2 0 0
File2 0 0 77 0 0 0 66
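I tried using awk to sort the columns of each file first and then merge the results with paste, but it does not work because I have more than 1000 columns and 30 files.
Code: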
awk -F"," 'NR==1{
split($0,a,FS);asort(a);
for(i=1;i<=NF;i++)b[$i]=i
} {
for(i=1;i<=NF;i++)printf("%s,",$(b[a[i]]));
print x
}' File1 > 1.csv
awk -F"," 'NR==1{
split($0,a,FS);asort(a);
for(i=1;i<=NF;i++)b[$i]=i
} {
for(i=1;i<=NF;i++)printf("%s,",$(b[a[i]]));
print x
}' File2 > 2.csv
paste -d"\n" 1.csv 2.csv > merge.csv
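I need some help here. Thanks in advance.
Answer 0 (score: 1)
I assume you omitted the commas in your files. If your files are space-separated, just change the delimiter passed to the split function.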
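awk '
# ARGIND is a gawk extension: 1 means the first file argument (base.csv).
# Base header: remember the column order and print the output header.
ARGIND==1 && FNR==1{
    nbase = split($0, base, ",")
    printf("file,%s\n",$0)
}
# Header of every other file: remember its column names, start the row with the file name.
ARGIND > 1 && FNR==1{
    nnames = split($0, names, ",")
    printf("%s", ARGV[ARGIND])
}
# Value line of every other file: map name to value, then print in base order.
ARGIND > 1 && FNR==2{
    split($0, values, ",")
    for(i=1; i<=nnames; i++)
        line[names[i]] = values[i]
    # Counted loop: for(i in base) visits elements in an unspecified order.
    for(i=1; i<=nbase; i++){
        if(base[i] in line)
            printf(",%s", line[base[i]])
        else
            printf(",0")
    }
    delete line
    print ""
}
' base.csv file1.csv file2.csv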
Example:
file1.csv:
tignor,michael,jose
888,9,-2
file2.csv:
ash,joshua
77,66
and base.csv:
steve,tignor,ash,michael,jose,sam,joshua
0,0,0,0,0,0,0
The output is:
file,steve,tignor,ash,michael,jose,sam,joshua
file1.csv,0,888,0,9,-2,0,0
file2.csv,0,0,77,0,0,0,66
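Basically, the script runs in two steps: first it reads the header of the base file and prints it as the output header; then, for every other file, it maps each column name to its value and prints the values in the base column order, writing 0 for any column the file does not have.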
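The name-to-value lookup at the heart of the script can be tried on its own. The following is only a minimal sketch: the inline sample data, the wanted list, and the lookup_demo variable are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical sample: one header line and one value line, shaped like file1.csv.
lookup_demo=$(printf 'tignor,michael,jose\n888,9,-2\n' | awk '
NR==1 { n = split($0, names, ",") }        # column names, in file order
NR==2 {
    split($0, values, ",")
    for (i = 1; i <= n; i++)
        line[names[i]] = values[i]           # name -> value
    # Ask for columns in some other order; names the file lacks fall back to 0.
    m = split("steve,michael,tignor", wanted, ",")
    for (i = 1; i <= m; i++) {
        v = (wanted[i] in line) ? line[wanted[i]] : 0
        printf("%s=%s\n", wanted[i], v)
    }
}')
printf '%s\n' "$lookup_demo"
```

Because the values are keyed by name rather than by position, the order of the columns in each input file does not matter, which is why no sorting step is needed.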
P.S. I made a new, POSIX awk compatible version of the script:
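awk --posix '
# NR==FNR holds only while the first file (base.csv) is being read.
# Base header: remember the column order and print the output header.
NR==FNR && FNR==1{
    nbase = split($0, base, ",")
    printf("file,%s\n",$0)
}
# Header of every other file: remember its column names, start the row with the file name.
NR>FNR && FNR==1{
    nnames = split($0, names, ",")
    printf("%s", FILENAME)
}
# Value line of every other file: map name to value, then print in base order.
NR>FNR && FNR==2{
    split($0, values, ",")
    for(i=1; i<=nnames; i++)
        line[names[i]] = values[i]
    # Counted loop: for(i in base) visits elements in an unspecified order.
    for(i=1; i<=nbase; i++){
        if(base[i] in line)
            printf(",%s", line[base[i]])
        else
            printf(",0")
    }
    delete line
    print ""
}
' base.csv file1.csv file2.csv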
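For whitespace-separated files like the samples in the question, a variant can rely on awk's default field splitting instead of split with a comma. This is a rough sketch under stated assumptions: the scratch directory, the file names base.txt, File1.txt and File2.txt, and the merged variable are all made up for the demo, and the shell glob File*.txt stands in for the real list of input files.

```shell
# Hypothetical demo data in a scratch directory; the file names are illustrative.
tmpdir=$(mktemp -d)
printf 'steve tignor ash michael jose sam joshua\n0 0 0 0 0 0 0\n' > "$tmpdir/base.txt"
printf 'tignor michael jose\n888 9 -2\n' > "$tmpdir/File1.txt"
printf 'ash joshua\n77 66\n' > "$tmpdir/File2.txt"

merged=$(cd "$tmpdir" && awk '
# Base header: keep the column order and echo the header line.
NR==FNR && FNR==1 { nbase = NF; for (i = 1; i <= NF; i++) base[i] = $i; print "file", $0 }
# Header of each data file: remember which name sits in which field.
NR>FNR && FNR==1 { nnames = NF; for (i = 1; i <= NF; i++) names[i] = $i }
# Value line: build name -> value, then emit values in base order (0 if absent).
NR>FNR && FNR==2 {
    for (i = 1; i <= nnames; i++) line[names[i]] = $i
    row = FILENAME
    for (i = 1; i <= nbase; i++)
        row = row " " ((base[i] in line) ? line[base[i]] : 0)
    print row
    for (k in line) delete line[k]   # per-element delete, portable to older awks
}
' base.txt File*.txt)
printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

The glob expands in lexical order, so with 30 files File10.txt would be processed before File2.txt; zero-padded names (File02.txt) or an explicit file list avoid that if row order matters.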