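How can I find all lines that contain a repeated lowercase word?
I would like to do this with egrep. Here is what I have tried so far, but I keep getting an "invalid back reference" error: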
egrep '\<(.)\>\1' inputFile.txt
egrep -w '\b(\w)\b\1' inputFile.txt
For example, given the following file:
The sky was grey.
The fall term went on and on.
I hope every one has a very very happy holiday.
My heart is blue.
I like you too too too much
I love daisies.
it should find the following lines:
The fall term went on and on.
I hope every one has a very very happy holiday.
I like you too too too much
It finds these lines because the words on, very, and too each occur more than once in their line.
Answer 0 (score: 1)
Got it: you need to find lines that contain a repeated word (all lowercase).
sed -n '/\s\([a-z]*\)\s.*\1/p' infile
\1 is a backreference, which sed supports; I am not sure whether grep / egrep supports it as well.
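One caveat with the sed command above: \1 is not anchored to word boundaries, so the captured word can also be matched inside a longer word (for example, a inside happy), and [a-z]* can capture an empty string. A minimal sketch of a stricter variant, assuming GNU sed (\b is a GNU extension) and a hypothetical helper name repeated_words_sed:

```shell
# Hypothetical helper (name is illustrative): print the lines of a file in which
# a whole all-lowercase word occurs at least twice.
# Assumes GNU sed, because \b (word boundary) is a GNU extension.
repeated_words_sed() {
    # [a-z][a-z]* requires at least one letter; \b on both sides of the capture
    # and of \1 stops the backreference from matching inside a longer word.
    sed -n '/\b\([a-z][a-z]*\)\b.*\b\1\b/p' "$1"
}

# Usage: repeated_words_sed infile
```

Here the capture no longer needs whitespace on both sides, so a repeated word at the very start or end of a line is also considered.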
Answer 1 (score: 1)
This can be done with the -E or -P option.
grep -E '(\b[a-z]+\b).*\b\1\b' file
Example:
$ cat file
The fall term went on and on.
I hope every one has a very very happy holiday.
Hi foo bar.
$ grep -E '(\b[a-z]+\b).*\b\1\b' file
The fall term went on and on.
I hope every one has a very very happy holiday.
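The same pattern can be wrapped in a small function and run against the six-line file from the question. This is only a sketch, assuming GNU grep (for \b) and a hypothetical helper name lines_with_repeated_word. Unlike the (.) group in the question's attempts, which captures a single character, ([a-z]+) captures a whole lowercase word, and .* lets the repeat appear anywhere later on the line rather than immediately after it:

```shell
# Hypothetical helper (name is illustrative): print lines that contain the same
# all-lowercase word at least twice, using the pattern from this answer.
# Assumes GNU grep, since \b is a GNU extension.
lines_with_repeated_word() {
    grep -E '(\b[a-z]+\b).*\b\1\b' "$1"
}

# Usage: lines_with_repeated_word inputFile.txt
```

On the question's sample file this should select the "on and on", "very very", and "too too too" lines.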
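Answer 2 (score: 1)
I know this is about grep, but here is an awk version.
It is more flexible, because you can easily change the test on the counter c:
c==2
exactly two identical words
c>2
more than two identical words
and so on.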
awk -F"[ \t.,]" '{c=0;for (i=1;i<=NF;i++) a[$i]++; for (i in a) c=c<a[i]?a[i]:c;delete a} c==2' file
The fall term went on and on.
I hope every one has a very very happy holiday.
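It loops over all the words in a line and uses each word as an array index to count its occurrences. A second loop then finds the highest count, which tells you whether any word is repeated.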
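The counting approach can also be written out more verbosely. The following is a minimal sketch rather than the answer's exact method: the helper name lines_with_repeats and the min threshold argument are hypothetical. Unlike the one-liner, it counts only all-lowercase words, and it skips the empty fields that the one-liner's field separator can produce around punctuation:

```shell
# Hypothetical helper (names are illustrative): print lines in which some
# all-lowercase word occurs at least MIN times (MIN defaults to 2).
# Usage: lines_with_repeats FILE [MIN]
lines_with_repeats() {
    LC_ALL=C awk -v min="${2:-2}" '
    {
        split("", seen)                       # reset the per-line counters
        max = 0
        n = split($0, words, /[^A-Za-z]+/)    # split the line on non-letters
        for (i = 1; i <= n; i++) {
            w = words[i]
            if (w ~ /^[a-z]+$/) {             # count all-lowercase words only
                seen[w]++
                if (seen[w] > max) max = seen[w]
            }
        }
    }
    max >= min                                # print when the threshold is met
    ' "$1"
}
```

Calling lines_with_repeats file 3 would keep only lines where a word appears three or more times, which corresponds to changing the test on c in the one-liner.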
Answer 3 (score: 0)
Try
egrep '[a-z]*' my_file
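This matches the runs of lowercase characters on each line. Note that because * also allows zero characters, it selects every line.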
egrep '[a-z]*' --color my_file
This highlights the lowercase characters in color.
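The difference between * and + matters for which lines get selected. A small sketch, using hypothetical helper names count_star and count_plus, that counts the selected lines for each quantifier:

```shell
# Hypothetical helpers (names are illustrative); both only need grep -E and -c.
# '*' allows zero characters, so the pattern matches on every line,
# including empty lines and lines without any lowercase letter.
count_star() { grep -cE '[a-z]*' "$1"; }

# '+' requires at least one lowercase letter, so only such lines are counted.
count_plus() { grep -cE '[a-z]+' "$1"; }

# Usage: count_star my_file; count_plus my_file
```

Neither variant detects repeated words; they only show why [a-z]* by itself does not filter anything.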