How do I find the lines in a file where a(ny) word occurs more than once?

Time: 2015-01-23 05:12:07

Tags: regex grep field multiple-columns custom-fields

I want to find the lines in which a(ny) word occurs more than once. For example, if the input text is

John is a teacher, who is not highly paid.
abc abcde
James lives in Detroit.
abc abc abcde
Paul has 2 dogs and 2 cats.

the output should be

John is a teacher, who is not highly paid.
abc abc abcde
Paul has 2 dogs and 2 cats.

The first line repeats "is", the second repeats "abc", and the last repeats "2".

2 answers:

Answer 0 (score: 3)

^(?=.*\b(\w+)\b.*\b\1\b).*$

Try this. See the demo.

https://www.regex101.com/r/rG7gX4/6

Use it with grep -P.
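
Put together, the full command might look like the sketch below. It assumes GNU grep built with PCRE support (the -P option), and input.txt is a hypothetical file name holding the sample text from the question.

```shell
# Write the sample input from the question to input.txt (hypothetical file name).
cat > input.txt <<'EOF'
John is a teacher, who is not highly paid.
abc abcde
James lives in Detroit.
abc abc abcde
Paul has 2 dogs and 2 cats.
EOF

# The lookahead captures one whole word and requires the same whole word
# to appear again later on the line; matching lines are printed in full.
grep -P '^(?=.*\b(\w+)\b.*\b\1\b).*$' input.txt
```

The \b anchors on both sides of the group and of the backreference are what keep "abc" from matching inside "abcde", so the line "abc abcde" is not reported.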

Answer 1 (score: 0)

Here is a simple way to do this in awk:
awk '{f=0;delete a;for (i=1;i<=NF;i++) if (a[$i]++) f=1} f' file
John is a teacher, who is not highly paid.
abc abc abcde
Paul has 2 dogs and 2 cats.

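It loops over every word on the line and counts each one in the array a. If a word is seen more than once, it sets the flag f.
If the flag f is true, the default action runs and the line is printed.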
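
The same logic can be spelled out over several lines, which may be easier to follow. This is only a sketch: the function name repeated_lines and the variable names are made up for illustration, and split("", seen) is swapped in for delete a as a long-standing idiom for emptying an array in awk implementations that do not accept delete on a whole array.

```shell
# repeated_lines is a hypothetical helper; it reads the named files or stdin.
repeated_lines() {
    awk '
    {
        repeated = 0
        split("", seen)             # empty the array before each line
        for (i = 1; i <= NF; i++)   # walk the whitespace-separated fields
            if (seen[$i]++)         # non-zero: this field was seen before
                repeated = 1
    }
    repeated                        # pattern with no action: print the line
    ' "$@"
}

printf '%s\n' 'abc abcde' 'abc abc abcde' | repeated_lines
```

Because awk compares whole fields here, "teacher," and "teacher" count as different words unless punctuation is stripped first.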


To see how many times each word is repeated:

awk '{f=0;delete a;for (i=1;i<=NF;i++) if (a[$i]++) f=1} f {for (i in a) if (a[i]>1) printf "%sx\"%s\"-",a[i],i;print $0}' file
2x"is"-John is a teacher, who is not highly paid.
2x"abc"-abc abc abcde
2x"2"-Paul has 2 dogs and 2 cats.

Some improvements: ignore case, and strip "." and ",".

awk '{f=0;delete a;for (i=1;i<=NF;i++) {w=tolower($i);sub(/[.,]/,"",w);if (a[w]++) f=1}} f' file
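
Note that sub replaces only the first match, so a token such as "e.g.," would keep some of its punctuation. A possible variant, offered as a sketch rather than as the answer's own code, uses gsub to remove every listed punctuation character and skips tokens that become empty. The function name normalized_repeats and the exact punctuation set are assumptions.

```shell
# normalized_repeats is a hypothetical helper; it reads the named files or stdin.
normalized_repeats() {
    awk '
    {
        repeated = 0
        split("", seen)                  # empty the array before each line
        for (i = 1; i <= NF; i++) {
            w = tolower($i)              # ignore case
            gsub(/[.,;:!?]/, "", w)      # remove every listed punctuation mark
            if (w != "" && seen[w]++)    # skip tokens that were only punctuation
                repeated = 1
        }
    }
    repeated
    ' "$@"
}

printf '%s\n' 'The cat saw the dog.' 'No repeats here.' | normalized_repeats
```

With this normalization "The" and "the" count as the same word, so the first sample line is printed and the second is not.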