Remove duplicate lines based on a field and line length

Time: 2017-06-23 11:57:06

Tags: linux awk sed grep

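I need to remove duplicate lines based on the field between the parentheses (for example, (265394673718132736)). Within each set of duplicates, the shorter line should be removed and the longer one kept.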

Example:

SERVER: 1 - (265394673718132736) - NO - ['OK', 'GROUP1']
SERVER: 2 - (284906813495967745) - NO - ['OK', 'GROUP1']
SERVER: 3 - (184387362225258496) - NO - ['OK', 'GROUP2']
SERVER: 4 - (118642771161645056) - NO - ['OK', 'GROUP1', 'SAR']
SERVER: 4 - (118642771161645056) - NO - ['OK', 'GROUP1']
SERVER: 5 - (234329090943877122) - NO - ['OK', 'GROUP4', 'SAR']
SERVER: 5 - (234329090943877122) - NO - ['OK', 'GROUP4', 'SAR', 'NO']
SERVER: 6 - (287039745190658069) - NO - ['OK', 'GROUP6']
SERVER: 7 - (280378736145072130) - NO - ['OK', 'GROUP3']

Desired result:

SERVER: 1 - (265394673718132736) - NO - ['OK', 'GROUP1']
SERVER: 2 - (284906813495967745) - NO - ['OK', 'GROUP1']
SERVER: 3 - (184387362225258496) - NO - ['OK', 'GROUP2']
SERVER: 4 - (118642771161645056) - NO - ['OK', 'GROUP1', 'SAR']
SERVER: 5 - (234329090943877122) - NO - ['OK', 'GROUP4', 'SAR', 'NO']
SERVER: 6 - (287039745190658069) - NO - ['OK', 'GROUP6']
SERVER: 7 - (280378736145072130) - NO - ['OK', 'GROUP3']

Edit:

I tried:

cat test | cut -f1 -d ":" --complement | sort -u -t'-' -k2,2

But I need to remove the shorter line, not an arbitrary one.

1 answer:

Answer 0: (score: 2)

awk '{a[$4]=length(a[$4])<length?$0:a[$4]}END{for(x in a)print a[x]}' file

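does the job. For each value of the fourth field (the parenthesized ID), it stores the current line only if it is longer than the line already stored for that ID.

Note that the order of the lines is not preserved in the output, because awk's for (x in a) loop visits array keys in an unspecified order.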
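If the original order matters, one possible variation is to record each ID the first time it appears and print in that order at the end. The following is only a sketch under the same assumption as the one-liner (the ID is the fourth whitespace-separated field); the function name dedupe_longest and the array names best and order are made up for illustration.

```shell
# Hypothetical helper: keep the longest line per parenthesized ID,
# printed in the order in which each ID first appears.
dedupe_longest() {
    awk '
    {
        key = $4                                  # e.g. "(118642771161645056)"
        if (!(key in best)) order[++n] = key      # remember first-seen order
        if (length($0) > length(best[key])) best[key] = $0
    }
    END { for (i = 1; i <= n; i++) print best[order[i]] }
    ' "$@"
}
```

Calling it as dedupe_longest file on the sample input should give the desired result shown in the question, with the lines in their original order. When two duplicates have the same length, the first one seen is kept.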