I have a list of IDs, as shown below.
20140201,ZTE_GENERIC_959,ZTE_GENERIC_959,PREPAID,ZTE_GENERIC_959,0,0,0,0,0,0,0,-120,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_959,ZTE_GENERIC_959,PREPAID,ZTE_GENERIC_959,-100,568,0,0,0,0,0,-25,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_988,ZTE_GENERIC_988,PREPAID,ZTE_GENERIC_988,-9,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_1010,ZTE_GENERIC_1010,PREPAID,ZTE_GENERIC_1010,0,0,0,0,0,0,0,-141,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_959,ZTE_GENERIC_959,PREPAID,ZTE_GENERIC_959,0,0,0,0,0,0,-79,-67,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_959,ZTE_GENERIC_959,PREPAID,ZTE_GENERIC_959,0,0,0,0,0,0,-474,146,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_1219,ZTE_GENERIC_1219,HYBRIDE,ZTE_GENERIC_1219,0,0,0,0,0,0,0,0,-210,137,0,0,0,0,0,0
20140201,ZTE_GENERIC_1010,ZTE_GENERIC_1010,PREPAID,ZTE_GENERIC_1010,-127.5,85,0,0,0,0,0,0,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_988,ZTE_GENERIC_988,PREPAID,ZTE_GENERIC_988,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_1081,ZTE_GENERIC_1081,PREPAID,ZTE_GENERIC_1081,-126.4,71,0,0,0,0,-63.2,11,0,0,0,0,0,0,0,0
20140201,ZTE_GENERIC_959,ZTE_GENERIC_2_ZTE_GENERIC_959,PREPAID,ZTE_GENERIC_959,0,0,0,0,0,0,0,-142,0,0,0,0,0,0,0,0
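I am looking for an awk script to find the duplicates in this list. The script I am using only considers the first column, so its output is wrong. I want to compare at least 3 or 4 columns so that the result is correct.
Answer 0: (score: 0)
Try this: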
1)
awk 'a[$0]++' File
This prints the duplicate lines, that is, the second and any later occurrence of each line.
2)
awk '!a[$0]++' File
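This removes the duplicate lines and keeps the first occurrence of each, if that is what you want. Both commands compare the entire line.
In the second command, a is an array of counters indexed by the entire line ($0). The first time a line is seen its count is zero, so !a[$0]++ is true and the line is printed; the post-increment then raises the count to 1. The next time the same line appears, the count is no longer zero, so the condition is false and the duplicate line is ignored.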
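As a rough illustration of the counter idiom described above, here is a minimal sketch run against a made-up three-line sample (the sample data and the array name seen are assumptions for the example, not part of the question):

```shell
# Hypothetical three-line sample; the first and third lines are identical.
sample='a,1
b,2
a,1'

# Keep only the first occurrence of each line (count is 0 on first sight).
unique=$(printf '%s\n' "$sample" | awk '!seen[$0]++')

# Print only the second and later occurrences (count is already non-zero).
repeats=$(printf '%s\n' "$sample" | awk 'seen[$0]++')

printf 'unique:\n%s\nrepeats:\n%s\n' "$unique" "$repeats"
```

The only difference between the two commands is the negation, so the same array tracks "seen before" in both cases.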
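Answer 1: (score: 0)
First of all, your question is not clear.
Please state whether the comparison should be on three columns or on four.
If the complete line needs to be compared, you already have A M D's solution. To compare only some columns, the same approach works with a small change: set the field separator with -F, and index the array by those fields.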
For 3 columns:
awk -F, '!a[$1,$2,$3]++' File
For 4 columns:
awk -F, '!a[$1,$2,$3,$4]++' File
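To list the duplicates rather than remove them, the same key can be used without the negation. The following is only a sketch, assuming that "duplicate" means "same first four fields"; the shortened sample rows and the array name seen are hypothetical, modelled on the layout in the question:

```shell
# Hypothetical sample: date, three name/type columns, one value column.
data='20140201,ZTE_GENERIC_959,ZTE_GENERIC_959,PREPAID,0
20140201,ZTE_GENERIC_988,ZTE_GENERIC_988,PREPAID,18
20140201,ZTE_GENERIC_959,ZTE_GENERIC_959,PREPAID,568
20140201,ZTE_GENERIC_959,ZTE_GENERIC_2_ZTE_GENERIC_959,PREPAID,-142'

# Rows whose first four fields were already seen (the later copies only).
dups=$(printf '%s\n' "$data" | awk -F, 'seen[$1,$2,$3,$4]++')

# First row for each distinct combination of the first four fields.
firsts=$(printf '%s\n' "$data" | awk -F, '!seen[$1,$2,$3,$4]++')

printf 'duplicates:\n%s\n\nfirst occurrences:\n%s\n' "$dups" "$firsts"
```

The commas inside the subscript join the fields with awk's SUBSEP character. Plain concatenation such as $1$2 can treat different rows as equal (for example "ab","c" and "a","bc" both become "abc"), which the separator avoids.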