Delete rows that match a specific string/value

Time: 2017-09-28 20:25:17

Tags: shell unix ksh

I have a flat file named "Master_Data" with the following rows (Customer_Key is the primary key):

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"

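I receive a file named "Daily_Data" with a similar structure. I need to append its rows to the "Master_Data" file if they are new, and update/delete the existing rows otherwise. For example, suppose I receive this "Daily_Data" file: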

Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"

Then my code should generate/modify the "Master_Data" file as follows:

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose" 

This is what I have tried so far:

sed -n '2,$p' /users/files/Daily_Data.csv >> /users/files/Master_Data.csv

But this just copies the data rows from Daily_Data and appends them to Master_Data, so the old row for key 3 is still there:

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","New York"
"3","1003","Austin"
"4","1004","San Jose"

What should I use/try to eliminate the row "3","1003","New York" in the best way?

2 Answers:

Answer 0: (score: 0)

Using awk, you can do this:

awk -F, 'NR==FNR{a[$1]=$0; next} $1 in a{$0=a[$1]; delete a[$1]} 1;
END{for (i in a) print a[i]}' Daily_Data Master_Data

Customer_Key,Customer_ID,Location
"1","1001","Washington D.C"
"2","1002","Los Angeles"
"3","1003","Austin"
"4","1004","San Jose"

Reference: Effective AWK Programming
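
For anyone who finds the one-liner dense, here is the same logic spread out with comments, as a minimal sketch. The function name `merge_master` and the array name `daily` are made up for illustration; the awk rules are the ones from the command above.

```shell
# merge_master DAILY MASTER   (hypothetical helper name)
# Prints MASTER with every row whose first field also appears in DAILY
# replaced by the DAILY version, followed by the DAILY rows that are new.
merge_master() {
    awk -F, '
        # First file (DAILY): remember each line, indexed by its key field.
        # Caveat: if DAILY is completely empty, NR == FNR also holds for
        # MASTER, so this sketch assumes DAILY has at least a header line.
        NR == FNR { daily[$1] = $0; next }

        # Second file (MASTER): a known key means the row is replaced.
        # The header matches too and is simply replaced by itself.
        $1 in daily { $0 = daily[$1]; delete daily[$1] }

        # Print every MASTER line, replaced or not.
        { print }

        # Keys still in daily[] never appeared in MASTER, so they are new.
        # The iteration order of for-in is unspecified in awk.
        END { for (k in daily) print daily[k] }
    ' "$1" "$2"
}
```

Called as `merge_master Daily_Data Master_Data`, it writes the merged result to standard output and leaves both files untouched. Because the for-in order is not guaranteed, several new rows may come out in any order, so sort afterwards if the order matters.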

Answer 1: (score: 0)

awk -F, 'NR == FNR {print; id[$1]; next} !($1 in id)' Daily_Data Master_Data
Customer_Key,Customer_ID,Location
"3","1003","Austin"
"4","1004","San Jose"
"1","1001","Washington D.C"
"2","1002","Los Angeles"

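This prints the Daily_Data rows first, followed by the Master_Data rows whose key is not in Daily_Data. To sort the result, you can do: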

awk ... | { read -r header; echo "$header"; sort -t'"' -k2,2n; }
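
The header-preserving sort can also be wrapped in a small function, which may be easier to reuse. This is only a sketch: `sort_keep_header` is a name invented here, and it assumes the input arrives on a pipe with the header on the first line.

```shell
# sort_keep_header   (hypothetical helper name)
# Reads CSV text on stdin, prints the first line unchanged, then prints
# the remaining lines sorted numerically by the quoted first column.
sort_keep_header() {
    # Consume only the header line; the rest of stdin is left for sort.
    IFS= read -r header || return
    printf '%s\n' "$header"
    # With '"' as the separator, field 1 is the empty text before the
    # opening quote and field 2 is the key itself, e.g. 3 in "3","1003",...
    sort -t'"' -k2,2n
}
```

Usage would be along the lines of `awk ... | sort_keep_header`, with the awk command being whichever merge you picked.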

To save it back to Master_Data, do one of the following:

awk ... > tmp && mv tmp Master_Data
awk ... | sponge Master_Data         # using `sponge` from `moreutils` package
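
If a fixed file name like `tmp` is a concern (for example, two runs in the same directory), the temp-file variant can be sketched with `mktemp`. The function name `update_master` is invented for illustration, and this assumes a `mktemp` that accepts a template argument (GNU and BSD versions do; it is not part of POSIX).

```shell
# update_master DAILY MASTER   (hypothetical helper name)
# Rewrites MASTER in place: DAILY rows first, then the MASTER rows whose
# key does not appear in DAILY.
update_master() {
    daily=$1
    master=$2
    # Create the temp file next to MASTER so that mv is a plain rename
    # on the same file system.
    tmp=$(mktemp "$master.XXXXXX") || return 1
    if awk -F, 'NR == FNR {print; id[$1]; next} !($1 in id)' \
           "$daily" "$master" > "$tmp"
    then
        # Replace MASTER only when awk succeeded.
        mv "$tmp" "$master"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Note that `mktemp` typically creates the file with restrictive permissions, so the replaced Master_Data may end up with a different mode than before; adjust with `chmod` if that matters.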