I have 2 csv files that look like this:
id, name, job
1, bob, fireman
3, alice, nurse
7, peter, policeman
...
and
id, name, age
2, john, 26
4, craig, 32
5, mary, 45
6, lucy, 23
...
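As you can see, both are sorted by id, and the ids missing from the first csv are actually in the second csv.
Is it possible to merge these two csvs with a command-line tool (e.g. awk or something similar) into one that looks like this?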
id, name, job, age
1, bob, fireman,
2, john, , 26
3, alice, nurse,
4, craig, , 32
...
Thank you very much for your help!
Answer 0 (score: 2)
This should do:
awk -F, -v OFS=, 'FNR==NR && FNR>1 {a[$1]=$0;c++;next} FNR>1{$NF=" ,"$NF;a[$1]=$0;c++} END {print "id, name, job, age";for (i=1;i<=c;i++) print a[i]}' file1 file2
id, name, job, age
1, bob, fireman
2, john, , 26
3, alice, nurse
4, craig, , 32
5, mary, , 45
6, lucy, , 23
7, peter, policeman
How it works:
awk -F, -v OFS=, '              # Set input and output field separator to ","
FNR==NR && FNR>1 {              # For the first file, except its header line:
  a[$1]=$0                      #   Store the record in array "a", keyed by id
  c++                           #   Increment "c" for every record
  next}                         #   Skip to the next record
FNR>1 {                         # For the second file, except its header line:
  $NF=" ,"$NF                   #   Prefix the last field with an extra "," (the empty job column)
  a[$1]=$0                      #   Store the record in array "a", keyed by id
  c++}                          #   Increment "c" for every record
END {                           # When all files have been read:
  print "id, name, job, age"    #   Print the header
  for (i=1;i<=c;i++)            #   Loop "c" times (assumes the ids run from 1 to c without gaps)
    print a[i]}                 #   Print the record stored for each id
' file1 file2                   # Read the files
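The END loop prints a[1] through a[c], so it only works when the ids are exactly 1 to c with no gaps, and rows from the first file come out without the trailing comma shown in the question. The following is a minimal sketch of one possible variant, not the answerer's method: it prints each row as it is read and lets sort put the ids in numeric order. It assumes the same file names file1 and file2, recreates only the rows shown in the question in a temporary directory, and uses the hypothetical variable names tmpdir and merged.

```shell
# Recreate the sample inputs from the question (only the rows shown there).
tmpdir=$(mktemp -d)
printf 'id, name, job\n1, bob, fireman\n3, alice, nurse\n7, peter, policeman\n' > "$tmpdir/file1"
printf 'id, name, age\n2, john, 26\n4, craig, 32\n5, mary, 45\n6, lucy, 23\n' > "$tmpdir/file2"

merged=$(
  # Header first, then the data rows sorted numerically on the id column.
  echo "id, name, job, age"
  awk -F, -v OFS=, '
    FNR==1  {next}                 # skip the header line of each file
    FNR==NR {print $0 ","; next}   # first file: job present, age left empty
            {$NF=" ," $NF; print}  # second file: empty job inserted before age
  ' "$tmpdir/file1" "$tmpdir/file2" | sort -t, -k1,1n
)

printf '%s\n' "$merged"
rm -r "$tmpdir"
```

Because nothing is held in an array, gaps in the ids do not produce blank lines, at the cost of an extra sort pass over the data.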
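When reading multiple files, FNR==NR is normally used to tell which file is currently being processed: NR counts records across all files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read.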
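To see the FNR==NR test in isolation, here is a small sketch that labels each line by the file it came from. The file names first.txt and second.txt and the variables tmpdir and result are made up for this illustration.

```shell
# Two throwaway files (hypothetical names) to demonstrate the idiom.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for every file.
# FNR==NR is therefore true only for records of the first file.
result=$(awk '
  FNR==NR {print "first:  " $0; next}   # records from the first file
          {print "second: " $0}         # records from any later file
' "$tmpdir/first.txt" "$tmpdir/second.txt")

printf '%s\n' "$result"
rm -r "$tmpdir"
```

One caveat with this idiom: if the first file is empty, FNR==NR is also true while the second file is read, so the first block would run on the wrong input.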