Linux CSV: remove duplicates based on older date

Time: 2014-11-11 15:29:52

Tags: linux

We have the following CSV file, which contains:

DCR_Path,Direction for Translation,Date & Time

data1,Send for Translation To CTM,Sep 30 2014 03:22

data2,Send for Translation To CTM,Sep 30 2014 02:21

data1,Send for Translation To CTM,Sep 30 2014 03:23

data1,Send for Translation To CTM,Sep 30 2013 03:24

data3,Send for Translation To CTM,Sep 30 2014 03:10

data2,Send for Translation To CTM,Sep 30 2014 02:22

data1,Send for Translation To CTM,Sep 30 2014 02:20

I need to keep the latest entry for each DCR_Path and remove the other duplicates. The output should be:

DCR_Path,Direction for Translation,Date & Time

data1,Send for Translation To CTM,Sep 30 2014 03:23

data2,Send for Translation To CTM,Sep 30 2014 02:22

data3,Send for Translation To CTM,Sep 30 2014 03:10

I tried the command below, but it does not remove the rows with the older dates.

sort -u -t, -k1,2 filename.txt

Any help with removing the duplicates that have older dates, while keeping the latest one, would be appreciated.

1 Answer:

Answer 0: (score: 0)

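Replace _YOUR_FILE_ with the name of your file. Note that this relies on GNU date (the --date option) to convert each timestamp to epoch seconds: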

awk -F ',' '{ if (Z) { cmd = "date --date=\"" $3 "\" +%s"; cmd | getline X; close(cmd); if (Y[$1] < X) { Y[$1] = X; C[$1] = $0 } } else { Z = $0 } } END { print Z; for (V in C) { print C[V] } }' < _YOUR_FILE_
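
The command above starts one date process per row and prints the surviving rows in whatever order awk iterates its array. As a possible alternative, here is a minimal sketch that builds a sortable key inside awk itself. It assumes the third column always looks like "Sep 30 2014 03:22" (English three-letter month, 24-hour time); the function name dedupe_latest and the array names are made up for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: keep only the newest row for each value of column 1.
# Assumption: column 3 is formatted like "Sep 30 2014 03:22".
dedupe_latest() {
    awk -F ',' '
    NR == 1 { print; next }                  # pass the header line through unchanged
    {
        split($3, d, " ")                    # d[1]=month, d[2]=day, d[3]=year, d[4]=HH:MM
        split(d[4], t, ":")                  # t[1]=hour, t[2]=minute
        # Map the month name to 1..12 by its position in a lookup string.
        m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", d[1]) + 2) / 3
        # Fixed-width YYYYMMDDHHMM key, so string comparison follows time order.
        key = sprintf("%04d%02d%02d%02d%02d", d[3], m, d[2], t[1], t[2])
        if (!($1 in best)) order[++n] = $1   # remember first-appearance order
        if (!($1 in best) || key > best[$1]) { best[$1] = key; row[$1] = $0 }
    }
    END { for (i = 1; i <= n; i++) print row[order[i]] }
    ' "$1"
}
```

Calling dedupe_latest _YOUR_FILE_ prints the header followed by one row per DCR_Path, in the order each DCR_Path first appears in the file. If the dates can come in other formats, converting them with date as in the command above is probably the safer route.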