I have two CSV files, old.csv and new.csv. I only need the new or updated records from new.csv. If a record in new.csv also exists in old.csv, it should be removed from new.csv.
old.csv
"R","abc","london","1234567"
"S","def","london","1234567"
"T","kevin","boston","9876"
"U","krish","canada","1234567"
new.csv
"R","abc","london","5678"
"S","def","london","1234567"
"T","kevin","boston","9876"
"V","Bell","tokyo","2222"
Expected output in new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"
Note: if every record in new.csv is identical to one in old.csv, new.csv should end up empty.
Answer 0 (score: 4)
For example, using grep:
$ grep -v -f old.csv new.csv # > the_new_new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"
and
$ grep -v -f old.csv old.csv
$ # see, no differences between 2 identical files
From man grep:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file
contains zero patterns, and therefore matches nothing. (-f is
specified by POSIX.)
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v
is specified by POSIX.)
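One caveat with the command above: without further options, grep treats each line of old.csv as a regular expression and matches it anywhere within a line, so a `.` in the data matches any character and a short record can match inside a longer one. A minimal sketch, assuming the goal is an exact whole-line comparison, adds `-F` (fixed strings) and `-x` (whole-line match). The scratch directory and the sample records below are made up for illustration only.

```shell
# Illustrative setup: sample files in a scratch directory (hypothetical data)
dir=$(mktemp -d)
printf '%s\n' '"R","abc","london","1234567"' '"T","a.c","boston","9876"' > "$dir/old.csv"
printf '%s\n' '"R","abc","london","5678"' '"T","abc","boston","9876"' > "$dir/new.csv"

# Regex matching: the "." in old.csv matches the "b" in new.csv, so the
# changed record "T","abc",... is wrongly treated as already present
regex_result=$(grep -v -f "$dir/old.csv" "$dir/new.csv")

# Fixed-string, whole-line matching: only exact duplicates are dropped
exact_result=$(grep -v -F -x -f "$dir/old.csv" "$dir/new.csv")

printf 'regex:\n%s\n\nexact:\n%s\n' "$regex_result" "$exact_result"
rm -r "$dir"
```

With this sample data the regex form keeps only the "R" record, while the `-F -x` form keeps both changed records.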
Alternatively, you can use awk:
$ awk 'NR==FNR{a[$0];next} !($0 in a)' old.csv new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"
Explained:
awk '
NR==FNR{ # the records in the first file are hashed to memory
a[$0]
next
}
!($0 in a) # the records which are not found in the hash are printed
' old.csv new.csv # > the_new_new.csv
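Since the question asks for the result to end up in new.csv itself, the following is a rough sketch of one way to do that. It writes to a temporary file first, because redirecting straight to new.csv would truncate the file before awk reads it. It also assumes you want to handle an empty old.csv: in that case `NR==FNR` stays true while new.csv is being read, so every record would be hashed and nothing printed. Comparing `FILENAME` with `ARGV[1]` avoids that. The scratch directory and the `.tmp` file name are illustrative choices, not part of the original answer.

```shell
# Illustrative setup: the question's sample data in a scratch directory
dir=$(mktemp -d)
printf '%s\n' '"R","abc","london","1234567"' '"S","def","london","1234567"' \
  '"T","kevin","boston","9876"' '"U","krish","canada","1234567"' > "$dir/old.csv"
printf '%s\n' '"R","abc","london","5678"' '"S","def","london","1234567"' \
  '"T","kevin","boston","9876"' '"V","Bell","tokyo","2222"' > "$dir/new.csv"

# Hash every record of the first file, then print records of the second
# file that are not in the hash. FILENAME==ARGV[1] identifies the first
# file even when it is empty (NR==FNR does not).
awk 'FILENAME==ARGV[1]{a[$0];next} !($0 in a)' "$dir/old.csv" "$dir/new.csv" \
  > "$dir/new.csv.tmp" &&
  mv "$dir/new.csv.tmp" "$dir/new.csv"   # replace new.csv only if awk succeeded

result=$(cat "$dir/new.csv")
printf '%s\n' "$result"
rm -r "$dir"
```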
Answer 1 (score: 0)
When the files are sorted:
comm -13 old.csv new.csv
If they are not sorted and sorting is allowed:
comm -13 <(sort old.csv) <(sort new.csv)
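For readers unfamiliar with comm: `-1` suppresses lines found only in the first file and `-3` suppresses lines common to both, which leaves the lines found only in the second file. Two things to keep in mind are that the output comes out in sorted order rather than the original order of new.csv, and that `<(...)` process substitution is a bash/ksh/zsh feature rather than plain POSIX sh. A minimal sketch using intermediate sorted files instead, with `LC_ALL=C` so that sort and comm agree on ordering (the scratch directory and the `.sorted` file names are illustrative):

```shell
# Illustrative setup: the question's data, with new.csv deliberately unsorted
dir=$(mktemp -d)
printf '%s\n' '"R","abc","london","1234567"' '"S","def","london","1234567"' \
  '"T","kevin","boston","9876"' '"U","krish","canada","1234567"' > "$dir/old.csv"
printf '%s\n' '"V","Bell","tokyo","2222"' '"S","def","london","1234567"' \
  '"T","kevin","boston","9876"' '"R","abc","london","5678"' > "$dir/new.csv"

# Sort both files under the same locale so sort and comm agree on ordering
LC_ALL=C sort "$dir/old.csv" > "$dir/old.sorted"
LC_ALL=C sort "$dir/new.csv" > "$dir/new.sorted"

# -1 drops lines unique to the first file, -3 drops lines common to both,
# leaving only the lines unique to the second file
only_new=$(LC_ALL=C comm -13 "$dir/old.sorted" "$dir/new.sorted")
printf '%s\n' "$only_new"
rm -r "$dir"
```

Here the "R" record is printed before the "V" record even though new.csv listed them the other way round, because the comparison runs on the sorted copies.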