Question

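I'm looking for a simple regular expression to match the same character repeated more than 10 or so times. For example, if I have a document littered with horizontal lines: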

=================================================

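It would match the line of = characters because the character is repeated more than 10 times. Note that I'd like this to work for any character.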

Answer 1

The regex you need is /(.)\1{9,}/.

Test:

#!perl
use warnings;
use strict;
my $regex = qr/(.)\1{9,}/;
print "NO" if "abcdefghijklmno" =~ $regex;
print "YES" if "------------------------" =~ $regex;
print "YES" if "========================" =~ $regex;

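Here \1 is called a backreference. It refers to whatever was captured by the dot between the parentheses, (.), and {9,} then asks for nine or more further copies of that character. So the pattern matches ten or more of any single character.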
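To make the counting concrete, here is a minimal Python sketch of the same pattern. The helper name find_runs is made up for illustration; it reports each captured character together with the length of the run that matched.

```python
import re

# (.) captures one character; \1{9,} requires nine or more copies of it,
# so every match is a run of at least ten identical characters.
RUN = re.compile(r'(.)\1{9,}')

def find_runs(text):
    """Return (character, run_length) for every run of 10+ identical characters."""
    return [(m.group(1), len(m.group(0))) for m in RUN.finditer(text)]

sample = "abc" + "=" * 12 + "def" + "-" * 9 + "x" * 10
print(find_runs(sample))  # [('=', 12), ('x', 10)]
```

The run of nine dashes is not reported, because one captured character plus nine repetitions needs ten characters in total.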

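Although the test script above is in Perl, this is very standard regex syntax and should work in most languages whose regex engine supports backreferences. In some variants you may need more backslashes; Emacs, for example, would make you write \(.\)\1\{9,\} here.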

If the whole string should consist of 10 or more identical characters, add anchors around the pattern:

my $regex = qr/^(.)\1{9,}$/;
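The same whole-string check can be sketched in Python. This assumes re.fullmatch is an acceptable stand-in for the ^...$ anchors; the function name is hypothetical.

```python
import re

WHOLE = re.compile(r'(.)\1{9,}')

def is_single_char_line(s):
    """True if s consists entirely of one character repeated 10+ times."""
    # fullmatch requires the pattern to cover the entire string,
    # which plays the role of the ^ and $ anchors.
    return WHOLE.fullmatch(s) is not None

print(is_single_char_line("=" * 10))       # True
print(is_single_char_line("=" * 9))        # False
print(is_single_char_line("==========x"))  # False
```

One difference to keep in mind: in Python, $ also matches just before a trailing newline, while fullmatch does not allow any leftover characters.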

Answer 2

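In Python you can use (.)\1{9,}

(.) makes a group from one character (any character)
\1{9,} matches nine or more further copies of the character captured by the first group

Example: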

import re

txt = """1. aaaaaaaaaaaaaaa
2. bb
3. cccccccccccccccccccc
4. dd
5. eeeeeeeeeeee"""
rx = re.compile(r'(.)\1{9,}')
lines = txt.split('\n')
for line in lines:
    rxx = rx.search(line)
    if rxx:
        print(line)

Output:

1. aaaaaaaaaaaaaaa
3. cccccccccccccccccccc
5. eeeeeeeeeeee

Answer 3

. matches any character. Use it together with the curly braces already mentioned:

$: cat > test
========
============================
oo
ooooooooooooooooooooooo


$: grep -E '(.)\1{10}' test
============================
ooooooooooooooooooooooo
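One detail in the pattern above: \1{10} asks for ten repetitions after the captured character, so it needs eleven identical characters in total, whereas \1{9,} needs ten. A quick Python check of that boundary (a sketch; Python's re is used here in place of grep):

```python
import re

ten = "=" * 10
eleven = "=" * 11

# One captured character plus the repetition count gives the total run length.
ten_matches_9_plus = bool(re.search(r'(.)\1{9,}', ten))       # needs 10 in total
ten_matches_10 = bool(re.search(r'(.)\1{10}', ten))           # needs 11 in total
eleven_matches_10 = bool(re.search(r'(.)\1{10}', eleven))

print(ten_matches_9_plus, ten_matches_10, eleven_matches_10)  # True False True
```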

Answer 4

Use the {10,} operator:

$: cat > testre
============================
==
==============

$: grep -E '={10,}' testre
============================
==============
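If the character to look for is only known at run time, the same idea can be sketched in Python by building the pattern from that character. The helper run_of and its minimum parameter are assumptions made for this example; re.escape keeps characters such as * from being read as regex operators.

```python
import re

def run_of(ch, minimum=10):
    """Compile a pattern matching `minimum` or more copies of the single character ch."""
    return re.compile(re.escape(ch) + "{%d,}" % minimum)

lines = ["=" * 28, "==", "=" * 14, "*" * 12]

equals_lines = [l for l in lines if run_of("=").search(l)]
star_lines = [l for l in lines if run_of("*").search(l)]

print(len(equals_lines), len(star_lines))  # 2 1
```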

Answer 5

You can also use PowerShell to quickly replace words or character repetitions. PowerShell is available on Windows; at the time of writing, the current version was 3.0.

$oldfile = "$env:windir\WindowsUpdate.log"

$newfile = "$env:temp\newfile.txt"
$text = (Get-Content -Path $oldfile -ReadCount 0) -join "`n"

# The pattern is given without / delimiters; in PowerShell, slashes would be matched literally.
$text -replace '(.)\1{9,}', ' ' | Set-Content -Path $newfile
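For comparison, a rough Python equivalent of the replacement step, operating on an in-memory string rather than a file. The function name blank_long_runs is hypothetical. As in PowerShell, the pattern is a plain string with no / delimiters.

```python
import re

def blank_long_runs(text, replacement=" "):
    """Replace every run of 10+ identical characters with `replacement`."""
    # By default . does not match a newline, so runs never span lines.
    return re.sub(r'(.)\1{9,}', replacement, text)

result = blank_long_runs("header\n" + "=" * 20 + "\nbody")
print(repr(result))  # 'header\n \nbody'
```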

Answer 6

In some apps you need to remove the slashes to make it work. Instead of:

/(.)\1{9,}/

use this:

(.)\1{9,}

Answer 7

={10,}

Matches = repeated 10 or more times.

Answer 8

An example using PHP's preg_replace:

$str = "motttherbb fffaaattther";
$str = preg_replace("/([a-z])\\1/", "", $str);
echo $str;

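Here [a-z] matches the character, and the parentheses () let it be reused through the \\1 backreference, which tries to match one more identical character (note that this already targets 2 consecutive characters). The output is therefore: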

mother father

If you did this instead:

$str = preg_replace("/([a-z])\\1{2}/", "", $str);

it would remove runs of 3 consecutive identical characters, outputting:

moherbb her
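The two replacements can be cross-checked in Python, whose re.sub behaves like preg_replace for these patterns. This is a sketch; the variable names are made up.

```python
import re

s = "motttherbb fffaaattther"

# Remove every pair of identical lowercase letters.
pairs_removed = re.sub(r'([a-z])\1', '', s)
# Remove every run of exactly three identical lowercase letters.
triples_removed = re.sub(r'([a-z])\1{2}', '', s)

print(pairs_removed)    # mother father
print(triples_removed)  # moherbb her
```

In the first call each "ttt" loses only two of its letters, which is why a single t survives in "mother" and "father".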

Answer 9

A slightly more generic PowerShell example. In PowerShell 7, the match is highlighted, including the last space (can you highlight in Stack?).

'a b c d e f ' | select-string '([a-f] ){6,}'

a b c d e f
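Note that a repeated group such as ([a-f] ){6,} accepts a different letter on each repetition, while a backreference insists on the same text each time. A short Python sketch of the difference (variable names are illustrative):

```python
import re

text = "a b c d e f "

# Repeated group: each repetition may match a different letter.
group_match = re.search(r'([a-f] ){6,}', text)
# Backreference: every repetition must equal the first captured "letter + space".
backref_match = re.search(r'([a-f] )\1{5,}', text)
same_letter_match = re.search(r'([a-f] )\1{5,}', "a a a a a a ")

print(repr(group_match.group(0)))     # 'a b c d e f '
print(backref_match)                  # None
print(same_letter_match is not None)  # True
```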

Regular expression to match any character repeated more than 10 times
