awk: delete lines that match a string

Time: 2014-12-10 18:16:35

Tags: awk

I am writing a script that deletes lines containing a repeated string. For example:

Epoch Time: 1418027874.795328000 seconds
Data: 4f67675300020000000000000000a6e0d
Epoch Time: 1418027874.807941000 seconds
Data: 4f676753000040caa20641080000a6e
Epoch Time: 1418027874.968753000 seconds
Data: 4f676753000080caa20641080000a6e0d4e40
Epoch Time: 1418027875.131557000 seconds
Epoch Time: 1418027875.131557012 seconds
Data: 4f676753000080caa206410870000a6e0d4e40

I want to remove the extra instance of the epoch time that appears twice at line 7.

2 Answers:

Answer 0: (score: 1)

Is this what you expect?

$ awk -F'[ .]' '!epochs[$3]++' file

Output

Epoch Time: 1418027874.795328000 seconds
Data: 4f67675300020000000000000000a6e0d
Epoch Time: 1418027875.131557000 seconds
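To see why only three lines survive, it helps to look at what `$3` holds once the field separator is `[ .]`. The sketch below prints the field count and `$3` for one Epoch line and one Data line; the function name `show_key` is made up for illustration. On Epoch lines `$3` is the whole-second part of the timestamp, and on Data lines it is empty, so every Data line after the first shares the same empty key.

```shell
# Hypothetical helper: show NF and $3 for each line when fields are split on space or dot.
show_key() {
  awk -F'[ .]' '{ printf "%d\t[%s]\n", NF, $3 }' "$@"
}

# Usage: feed two sample lines on standard input.
printf '%s\n' \
  'Epoch Time: 1418027874.795328000 seconds' \
  'Data: 4f67675300020000000000000000a6e0d' | show_key
```

The first line splits into five fields with `$3` equal to `1418027874`; the second has two fields and an empty `$3`.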

Answer 1: (score: 0)

If your data file contained identical epochs (the sample you posted does not), the following would do the job:

awk 'NF==4{if($3 in e)next;e[$3]} 1' your.data
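The same one-liner, spread out with comments, may be easier to follow. This is only a restatement of the command above wrapped in a shell function; the function name `dedupe_epochs` and the array name `seen` are chosen here for readability and are not part of the original answer.

```shell
# Hypothetical wrapper around the one-liner: drop Epoch lines whose timestamp was already seen.
dedupe_epochs() {
  awk '
    NF == 4 {               # "Epoch Time: <timestamp> seconds" has four fields
      if ($3 in seen) next  # timestamp already printed once, so skip this line
      seen[$3]              # referencing the element creates it, recording the timestamp
    }
    { print }               # print every line that was not skipped
  ' "$@"
}

# Usage: dedupe_epochs your.data
```

Data lines have two fields, so they never enter the `NF == 4` block and are always printed.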

Test

% cat ep.dat
Epoch Time: 1418027874.795328000 seconds
Data: 4f67675300020000000000000000a6e0d
Epoch Time: 1418027874.807941000 seconds
Data: 4f676753000040caa20641080000a6e
Epoch Time: 1418027874.968753000 seconds
Data: 4f676753000080caa20641080000a6e0d4e40
Epoch Time: 1418027875.131557000 seconds
Epoch Time: 1418027875.131557000 seconds
Data: 4f676753000080caa206410870000a6e0d4e40
% gawk 'NF==4{if($3 in e)next;e[$3]}1' ep.dat
Epoch Time: 1418027874.795328000 seconds
Data: 4f67675300020000000000000000a6e0d
Epoch Time: 1418027874.807941000 seconds
Data: 4f676753000040caa20641080000a6e
Epoch Time: 1418027874.968753000 seconds
Data: 4f676753000080caa20641080000a6e0d4e40
Epoch Time: 1418027875.131557000 seconds
Data: 4f676753000080caa206410870000a6e0d4e40
% mawk 'NF==4{if($3 in e)next;e[$3]}1' ep.dat
Epoch Time: 1418027874.795328000 seconds
Data: 4f67675300020000000000000000a6e0d
Epoch Time: 1418027874.807941000 seconds
Data: 4f676753000040caa20641080000a6e
Epoch Time: 1418027874.968753000 seconds
Data: 4f676753000080caa20641080000a6e0d4e40
Epoch Time: 1418027875.131557000 seconds
Data: 4f676753000080caa206410870000a6e0d4e40
% 

Note that I edited the data file so that it contains two equal epoch times.
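In the question's own sample the two adjacent epochs differ in their last digits (`...131557000` versus `...131557012`), so an exact-match key keeps both. A rough sketch of one way around that, assuming (the question does not say so) that timestamps which agree to six decimal places should count as duplicates. The function name `dedupe_epochs_usec` and the extra `$1 == "Epoch"` check are additions made here for illustration.

```shell
# Hypothetical variant: treat epochs as equal when they match to six decimal places.
dedupe_epochs_usec() {
  awk '
    NF == 4 && $1 == "Epoch" {
      dot = index($3, ".")                      # position of the decimal point, 0 if absent
      key = dot ? substr($3, 1, dot + 6) : $3   # keep six fractional digits as the key
      if (key in seen) next                     # same key already printed, so skip the line
      seen[key]                                 # record the key
    }
    { print }                                   # print everything that was not skipped
  ' "$@"
}

# Usage: dedupe_epochs_usec your.data
```

The number of digits kept is a judgment call; change the `6` if a coarser or finer match suits the data better.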