Question

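How can I find all lines that contain a repeated lowercase word? I would like to do this with egrep. This is what I have tried so far, but I keep getting an "invalid back reference" error: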

egrep '\<(.)\>\1' inputFile.txt
egrep -w '\b(\w)\b\1' inputFile.txt

For example, if I have the following file:

The sky was grey. 
The fall term went on and on.
I hope every one has a very very happy holiday.
My heart is blue.
I like you too too too much
I love daisies.

It should find the following lines in the file:

The fall term went on and on.
I hope every one has a very very happy holiday.
I like you too too too much

It finds these lines because the words on, very and too each occur more than once in their line.

Answer 1

Got it, you need to find repeated words (all lowercase):

sed -n '/\<\([a-z][a-z]*\)\>.*\<\1\>/p' infile

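Tools are there to serve your requirements. Restricting yourself to a single tool is not a good approach.

\1 is a feature of sed, but I am not sure whether grep / egrep has it as well.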
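
For reference, GNU grep does accept \1 with -E, as an extension beyond POSIX ERE. A minimal sketch, assuming GNU sed and GNU grep; the two input lines are made up for illustration, and LC_ALL=C is set only so that [a-z] means plain ASCII lowercase:

```shell
# Two made-up input lines: only the first repeats a lowercase word.
input='went on and on
my heart is blue'

# sed (BRE): the group is written \( \) and referenced as \1.
sed_out=$(printf '%s\n' "$input" | LC_ALL=C sed -n '/\<\([a-z][a-z]*\)\>.*\<\1\>/p')

# GNU grep -E (ERE): the group is written ( ) and \1 still refers to it.
grep_out=$(printf '%s\n' "$input" | LC_ALL=C grep -E '\<([a-z]+)\>.*\<\1\>')

printf 'sed:  %s\ngrep: %s\n' "$sed_out" "$grep_out"
```

Both commands should print only the first line, since "on" is the only word that occurs twice.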

Answer 2

This can be done with the -E or -P option.

grep -E '(\b[a-z]+\b).*\b\1\b' file

Example:

$ cat file
The fall term went on and on.
I hope every one has a very very happy holiday.
Hi foo bar.
$ grep -E '(\b[a-z]+\b).*\b\1\b' file
The fall term went on and on.
I hope every one has a very very happy holiday.
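
To check the same pattern against the file from the question, a sketch along these lines could be used. It assumes GNU grep; sample.txt is a scratch file name invented here, -n is added only to show which line numbers match, and LC_ALL=C keeps [a-z] to ASCII lowercase:

```shell
# sample.txt is a hypothetical scratch file holding the question's six lines.
cat > sample.txt <<'EOF'
The sky was grey.
The fall term went on and on.
I hope every one has a very very happy holiday.
My heart is blue.
I like you too too too much
I love daisies.
EOF

# -n prefixes each matching line with its line number.
matches=$(LC_ALL=C grep -nE '(\b[a-z]+\b).*\b\1\b' sample.txt)
printf '%s\n' "$matches"
```

On this input it should report lines 2, 3 and 5, which are the three lines the question asks for.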

Answer 3

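I know this is about grep, but here is an awk version. It is more flexible, because you can easily change the test on the counter c: c==2 for exactly two identical words, c>2 for more than two identical words, and so on.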

awk -F"[ \t.,]" '{c=0;for (i=1;i<=NF;i++) if ($i ~ /^[a-z]+$/) a[$i]++; for (i in a) c=c<a[i]?a[i]:c;delete a} c==2' file
The fall term went on and on.
I hope every one has a very very happy holiday.

It loops over all the words in a line and creates an array index for each word, then runs a second loop to see whether any word is repeated.
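
If it also helps to see which word is repeated, a variation might look like the sketch below. It is not the one-liner above: it splits on whitespace, strips non-letters from each field, and counts only all-lowercase words. The file name sample.txt and the variable names are made up for this example.

```shell
# sample.txt is a hypothetical scratch file with three test lines.
cat > sample.txt <<'EOF'
The fall term went on and on.
Hi foo bar.
I like you too too too much
EOF

report=$(awk '{
    split("", seen)                    # empty the per-line counter array
    for (i = 1; i <= NF; i++) {
        w = $i
        gsub(/[^A-Za-z]/, "", w)       # drop punctuation such as a trailing "."
        # count all-lowercase words; report a word once, on its second occurrence
        if (w ~ /^[a-z]+$/ && ++seen[w] == 2)
            print w ": " $0
    }
}' sample.txt)
printf '%s\n' "$report"
```

Each reported line is prefixed with the repeated word, so the first and third lines of the sample come out as "on: ..." and "too: ...".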

Answer 4

Try

egrep '[a-z]+' my_file

This prints every line that contains at least one lowercase character.

egrep '[a-z]+' --color my_file

This highlights the lowercase characters.
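
To see exactly which substrings a pattern like [a-z]+ matches, -o prints every match on its own line. A small sketch with a made-up input line, assuming GNU grep and using grep -E (equivalent to egrep):

```shell
# -o prints only the matched parts of each line, one match per output line.
runs=$(printf '%s\n' 'Hi foo bar.' | LC_ALL=C grep -oE '[a-z]+')
printf '%s\n' "$runs"
```

Note that this shows runs of lowercase letters, not repeated words; finding repeats still needs a back-reference.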
