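I have a file, 'data.csv', with several thousand rows that need to be filtered against a text file, 'blacklist.txt', which also has several thousand lines.
If a row in data.csv partially matches any line in blacklist.txt, it should be removed.
The result should be saved to a new CSV file, 'data-filtered.csv'.
Here are some sample rows from data.csv: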
"apple","orange","banana","","","","fruit"
"pork","beef","chicken","turkey","shrimp","fish","meat"
"green beans","peas","carrots","lettuce","","","veggies"
"milk","cheese","yogurt","sour cream","","","dairy"
Sample data from blacklist.txt:
meat
yogurt
I want to filter data.csv against blacklist.txt so that only these rows are written to the new CSV file 'data-filtered.csv', as shown below:
"apple","orange","banana","","","","fruit"
"green beans","peas","carrots","lettuce","","","veggies"
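I tried using grep but could not get it to work. Here is one command I tried: grep -v blacklist.txt data.csv > data-filtered.csv
The resulting file contains all the original rows from data.csv, and nothing was filtered out.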
Answer 0 (score: 0)
Is this close to what you need:
grep -vFf blacklist.txt data.csv > data-filtered.csv
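The command tried in the question most likely failed because, without -f, grep treats the text blacklist.txt as the pattern itself. No row contains that text, so -v keeps every row. The sketch below spells out the three flags against the sample data from the question. The scratch directory and the heredoc setup are assumptions added for illustration only.

```shell
# Recreate the question's sample files in a scratch directory (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > data.csv <<'EOF'
"apple","orange","banana","","","","fruit"
"pork","beef","chicken","turkey","shrimp","fish","meat"
"green beans","peas","carrots","lettuce","","","veggies"
"milk","cheese","yogurt","sour cream","","","dairy"
EOF

printf '%s\n' meat yogurt > blacklist.txt

# -f blacklist.txt : read the patterns from the file, one per line
# -F               : treat each pattern as a fixed string, not a regex
# -v               : print only the rows that match none of the patterns
grep -vFf blacklist.txt data.csv > data-filtered.csv

cat data-filtered.csv
```

One caveat worth checking: an empty line in blacklist.txt is an empty pattern, which matches every row, so with -v nothing would be printed.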
Answer 1 (score: 0)
This is what you need if you want to avoid partial matches, e.g. apple matching pineapple:
$ awk 'NR==FNR{bl=(NR>1 ? bl "|" : "") "\""$0"\""; next} !($0 ~ bl)' blacklist.txt data.csv
Look:
$ cat data.csv
"pineapple","orange","banana","","","","fruit"
"pork","beef","chicken","turkey","shrimp","fish","meat"
"green beans","peas","carrots","lettuce","","","veggies"
"milk","cheese","yogurt","sour cream","","","dairy"
$ cat blacklist.txt
apple
meat
yogurt
$ awk 'NR==FNR{bl=(NR>1 ? bl "|" : "") "\""$0"\""; next} !($0 ~ bl)' blacklist.txt data.csv
"pineapple","orange","banana","","","","fruit"
"green beans","peas","carrots","lettuce","","","veggies"
$ grep -vFf blacklist.txt data.csv
"green beans","peas","carrots","lettuce","","","veggies"