Basically (to cut a long story short), I have a CSV file in the following format:
"ID","Name","Phone Number"
"00001","Ricky Stallman","07771111111"
"00003","Harrison Ford","07701010101"
"00003","Harrison Ford",""
"00008","Bob Geldof","07712121212"
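The "Harrison Ford" entry appears a second time in my CSV with no number next to it (that is just the annoying way the data arrives). I need the CSV to read like this (i.e. with the number from the row above copied into the empty field below):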
"ID","Name","Phone Number"
"00001","Ricky Stallman","07771111111"
"00003","Harrison Ford","07701010101"
"00003","Harrison Ford","07701010101"
"00008","Bob Geldof","07712121212"
Preferably in Bash. Does anyone have any suggestions?
Answer 0 (score: 3)
Try this:
awk -F',' '$3!~/""/{nbr=$3} {print $1","$2","nbr}' file
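If the third column is empty (""), the most recent non-empty value is printed instead. Note that this splits on every comma, so it assumes no field contains a comma itself.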
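Since the question asks for Bash, the same carry-forward idea can also be sketched without awk. This is only a minimal sketch: `fill_down` is a hypothetical helper name, and it assumes exactly three comma-separated fields with no commas inside any field.

```shell
#!/usr/bin/env bash
# fill_down: hypothetical helper, not part of the answer above.
# Assumes three comma-separated fields and no commas inside a field.
fill_down() {
    local id name phone last=''
    while IFS=, read -r id name phone; do
        if [ "$phone" = '""' ]; then
            phone=$last          # empty field: reuse the previous number
        else
            last=$phone          # non-empty field: remember it
        fi
        printf '%s,%s,%s\n' "$id" "$name" "$phone"
    done
}

# Sample data from the question, fed on standard input.
fill_down <<'EOF'
"ID","Name","Phone Number"
"00001","Ricky Stallman","07771111111"
"00003","Harrison Ford","07701010101"
"00003","Harrison Ford",""
"00008","Bob Geldof","07712121212"
EOF
```

For a real file, `fill_down < file` would do the same thing. On large inputs the awk one-liner is likely to be faster than a shell read loop.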
Answer 1 (score: 2)
A gawk solution could be used:
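#!/usr/bin/gawk -f
# Capture the first field and the last two fields of each quoted record.
match($0, /"([^\"]*)".*,"([^"]*)","([^"]*)"/, t) {
    key = t[1] "|" t[2]  ## Or just key = t[2] to be less strict.
    # Store the number unless it is empty and this key already has one.
    if (!(t[3] == "" && key in a)) {
        a[key] = t[3]
    }
    printf "\"%s\",\"%s\",\"%s\"\n", t[1], t[2], a[key]
}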
Condensed:
gawk 'match($0, /"([^\"]*)".*,"([^"]*)","([^"]*)"/, t) { key = t[1] "|" t[2]; if (!(t[3] == "" && key in a)) a[key] = t[3]; printf "\"%s\",\"%s\",\"%s\"\n", t[1], t[2], a[key] }' file
Output:
"ID","Name","Phone Number"
"00001","Ricky Stallman","07771111111"
"00003","Harrison Ford","07701010101"
"00003","Harrison Ford","07701010101"
"00008","Bob Geldof","07712121212"
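The keyed lookup above does not depend on the duplicate rows being adjacent, unlike a plain carry-forward of the previous row. A rough sketch of the same idea in portable awk, without gawk's three-argument `match()`, might look like the following. `fill_by_key` and the `seen` array are hypothetical names, and the sketch assumes three comma-separated fields with no embedded commas.

```shell
# fill_by_key: hypothetical helper; assumes three fields, no embedded commas.
fill_by_key() {
    awk -F',' -v OFS=',' '
    {
        key = $1 "|" $2                       # ID and name identify a person
        if ($3 != "\"\"") seen[key] = $3      # remember the latest non-empty number
        else if (key in seen) $3 = seen[key]  # fill the gap from the same key
        print
    }' "$@"
}

# The empty row is separated from its match by another record.
fill_by_key <<'EOF'
"00003","Harrison Ford","07701010101"
"00008","Bob Geldof","07712121212"
"00003","Harrison Ford",""
EOF
# The last line is printed as "00003","Harrison Ford","07701010101"
```

A row whose key has not been seen before with a number is printed unchanged, so an empty field with no earlier match stays empty.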