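Get unique array values using a mask

Time: 2014-05-23 07:21:12

Tags: arrays linux macos bash sorting

I have a list of URLs and their corresponding domains. Entries in the array are separated by \n.
Within each entry, the domain and the URL are separated by a comma.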

site1.com,www.site1.com/blahA-blahB-blahC   
site2.com,site2.com/blahD-blahE-blahF   
site2.com,site2.com/blahG-blahH-blahI   
site3.com,site3.com/blahJ-blahK-blahL

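I want to filter this array and remove lines whose domain is a duplicate, keeping only the first occurrence of each domain. The desired output is: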

site1.com,www.site1.com/blahA-blahB-blahC   
site2.com,site2.com/blahD-blahE-blahF   
site3.com,site3.com/blahJ-blahK-blahL

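Please advise.

1 answer:

Answer 0 (score: 0)

Try this awk command. It splits each line on / and prints a line only the first time its first field (everything before the first slash) is seen: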

awk -F/ '!x[$1]++' file

Output:

site1.com,www.site1.com/blahA-blahB-blahC
site2.com,site2.com/blahD-blahE-blahF
site3.com,site3.com/blahJ-blahK-blahL
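
With -F/ the lookup key is the whole text before the first slash (for example site2.com,site2.com), which is enough for the sample data but would treat site2.com,site2.com/... and site2.com,www.site2.com/... as different entries. A minimal sketch that keys on the domain column alone, assuming the domain is always the text before the first comma; the function name dedupe_by_domain is hypothetical and not part of the original answer:

```shell
# Keep only the first line seen for each domain.
# -F, makes $1 the text before the first comma (the domain column).
# seen[$1]++ is 0 (false) the first time a domain appears, so the
# negation is true and awk prints the line; later repeats are skipped.
dedupe_by_domain() {
  awk -F, '!seen[$1]++' "$@"
}
```

It can be called as dedupe_by_domain file, or fed from standard input when the entries live in a bash array (here a hypothetical array named urls): printf '%s\n' "${urls[@]}" | dedupe_by_domain.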