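I have an array of URLs and their corresponding domains.
The array entries are separated by \n.
Within each entry, the domain and the URL are separated by a comma.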
site1.com,www.site1.com/blahA-blahB-blahC
site2.com,site2.com/blahD-blahE-blahF
site2.com,site2.com/blahG-blahH-blahI
site3.com,site3.com/blahJ-blahK-blahL
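I want to filter this array and remove the lines whose domain is a duplicate, keeping only the first occurrence of each domain. The desired output is: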
site1.com,www.site1.com/blahA-blahB-blahC
site2.com,site2.com/blahD-blahE-blahF
site3.com,site3.com/blahJ-blahK-blahL
Please advise.
Answer 0 (score: 0)
Try this awk command, which prints a line only the first time its key (the text before the first "/") is seen:
awk -F/ '!x[$1]++' file
Output:
site1.com,www.site1.com/blahA-blahB-blahC
site2.com,site2.com/blahD-blahE-blahF
site3.com,site3.com/blahJ-blahK-blahL
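Note that the command above uses "/" as the field separator, so its key is the whole "domain,host" prefix rather than the domain alone. That happens to work for the sample data, but two lines for the same domain with different hosts (for example site1.com and www.site1.com) would both be kept. A minimal sketch of a variant keyed on the domain column, assuming the domain is always the first comma-separated field; the function name dedupe_by_domain and the array name seen are made up for illustration:

```shell
# Hypothetical helper: keep only the first line seen for each domain.
# Assumes the domain is the first comma-separated field of every line.
dedupe_by_domain() {
    # seen[$1]++ evaluates to 0 (false) the first time a domain appears,
    # so !seen[$1]++ is true only for that first line, which awk prints.
    awk -F, '!seen[$1]++' "$@"
}

# Usage with the sample data from the question, read from standard input.
printf '%s\n' \
    'site1.com,www.site1.com/blahA-blahB-blahC' \
    'site2.com,site2.com/blahD-blahE-blahF' \
    'site2.com,site2.com/blahG-blahH-blahI' \
    'site3.com,site3.com/blahJ-blahK-blahL' |
    dedupe_by_domain
```

The function also accepts file names, as in dedupe_by_domain file. Because awk processes lines in order, the surviving line for each domain is always the first one encountered, and the original ordering is preserved.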