I have an input file with the following contents:
123,apple,orange
123,pineapple,strawberry
543,grapes,orange
790,strawberry,apple
870,peach,grape
543,almond,tomato
123,orange,apple
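I would like the output to be: The following numbers are repeated: 123 543
Is there a way to get this output using awk? I am writing the script in bash on Solaris.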
Answer 0 (score: 2)
sed -e 's/,/ , /g' filename | awk '{print $1}' | sort | uniq -d
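The sed step in the pipeline above only pads each comma with spaces so that awk's default whitespace splitting isolates the first field. As a rough sketch of the same idea, awk can split on commas itself with -F, which makes the sed step unnecessary. The helper name first_column_dups is made up for illustration. On Solaris the legacy /usr/bin/awk may behave differently from nawk or /usr/xpg4/bin/awk, so treat the choice of awk binary as an assumption about your environment.

```shell
# first_column_dups FILE
# Hypothetical helper (not from the answers): prints each value of the first
# comma-separated field that appears on more than one line of FILE.
first_column_dups() {
    # -F, tells awk to split fields on commas; uniq -d requires sorted input
    # and prints one copy of every line that occurs more than once.
    awk -F, '{ print $1 }' "$1" | sort | uniq -d
}

# Example run against a temporary copy of the data from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
123,apple,orange
123,pineapple,strawberry
543,grapes,orange
790,strawberry,apple
870,peach,grape
543,almond,tomato
123,orange,apple
EOF
first_column_dups "$sample"
rm -f "$sample"
```

With the question's data this prints 123 and 543, one per line.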
Answer 1 (score: 1)
If you can do without awk, you can use this to get the repeated numbers:
cut -d, -f 1 my_file.txt | sort | uniq -d
This prints:
123
543
Edit (in response to your comment)
You can capture the output in a variable and then decide whether to continue. For example:
out=$(cut -d, -f 1 my_file.txt | sort | uniq -d | tr '\n' ' ')
if [[ -n $out ]] ; then
    echo "The following numbers are repeated: $out"
    exit
fi
# continue...
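If the check is needed in more than one place, the same buffering idea can be wrapped in a function that reports duplicates through its return status, leaving the decision to stop or continue with the caller. This is only a sketch: the name check_dups and the variable dups_found are hypothetical, and it uses plain [ ] tests so it also runs under a POSIX sh.

```shell
# check_dups FILE
# Hypothetical helper: prints a message and returns 1 when the first
# comma-separated column of FILE contains repeated values, else returns 0.
check_dups() {
    # Collect the repeated values on one line, separated by spaces.
    dups_found=$(cut -d, -f 1 "$1" | sort | uniq -d | tr '\n' ' ')
    if [ -n "$dups_found" ]; then
        echo "The following numbers are repeated: $dups_found"
        return 1
    fi
    return 0
}

# Possible use in a main script (file name is a placeholder):
#   check_dups my_file.txt || exit 1
#   # continue...
```

Returning a status instead of calling exit inside the function keeps it reusable, because the caller can still choose to log the problem and carry on.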
Answer 2 (score: 1)
This script prints only the first-column values that occur more than once:
awk -F, '{a[$1]++}END{printf "The following numbers are repeated: ";for (i in a) if (a[i]>1) printf "%s ",i; print ""}' file
Or, in a shorter form:
awk -F, 'BEGIN{printf "Repeated "}(a[$1]++ == 1){printf "%s ", $1}END{print ""} ' file
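The shorter form relies on a[$1]++ being a post-increment: the comparison sees the count before the current line is added, so it equals 1 exactly when a key is seen for the second time. Each repeated key is therefore printed once, in the order of its second occurrence, whereas the order of a for (i in a) loop is unspecified. A minimal sketch that isolates just that pattern (the function name dups_in_order and the array name seen are made up for illustration):

```shell
# dups_in_order FILE
# Hypothetical helper: prints each repeated first-column value once, at the
# moment its second occurrence is read.
dups_in_order() {
    # seen[$1]++ yields the old count and then increments it, so the pattern
    # is true only on the second line that carries a given key.
    awk -F, 'seen[$1]++ == 1 { print $1 }' "$1"
}

# Example: 9 repeats on line 3 and 2 repeats on line 4, so 9 is printed first.
sample=$(mktemp)
printf '%s\n' 2,a 9,b 9,c 2,d > "$sample"
dups_in_order "$sample"
rm -f "$sample"
```

Because no sort is involved, this variant reads the file only once and needs no END block.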
If you want the script to stop when a duplicate is found, you can have awk exit with a non-zero exit code. For example:
awk -F, 'a[$1]++==1{dup=1}END{if (dup) {printf "The following numbers are repeated: ";for (i in a) if (a[i]>1) printf "%s ",i; print "";exit(1)}}' file
In your main script, you can then do:
awk -F, 'a[$1]++==1{dup=1}END{if (dup) {printf "The following numbers are repeated: ";for (i in a) if (a[i]>1) printf "%s ",i; print "";exit(1)}}' file || exit 1
Or, in a more readable format:
awk -F, '
    # Set a flag the second time any key is seen.
    a[$1]++ == 1 {
        dup = 1
    }
    END {
        if (dup) {
            printf "The following numbers are repeated: "
            for (i in a)
                if (a[i] > 1)
                    printf "%s ", i
            print ""
            exit(1)
        }
    }
' file || exit 1
Answer 3 (score: 1)
awk -F, '
    { KEY = $1; if (KEY in KEYS) { DUPS[KEY] }; KEYS[KEY] }
    END { print "Repeated Keys:"; for (i in DUPS) { print i } }
' < yourfile
There are also the sort / uniq / cut solutions (see above).