Question

I want to match a string when it does not contain abc followed by some characters (possibly none) and ends with .com.

I tried the following:

(?!abc).*\.com

or

(?!abc).*?\.com

or

(?<!abc).*\.com

or

(?<!abc).*?\.com

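But none of these worked. Please help.

Many thanks!

Edit

Sorry if I didn't make myself clear. Here are some examples: I want def.edu, abc.com, abce.com, eabc.com and abcAnYTHing.com not to match, while a.com, b.com, ab.com, ae.com etc. should match.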

Answer 1

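It's not clear from your wording whether you want to match a string that ends with .com and does NOT contain abc before that, or to match a string that doesn't have "abc followed by characters followed by .com".

Meaning, in the first case "def.edu" does NOT match (no "abc", but it doesn't end with ".com"), whereas in the second case "def.edu" matches (because it's not "abcSOMETHING.com").

In the first case, you need to use a negative look-behind: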

(?<!abc.+)\.com$
# Use .* instead of .+ if you want "abc.com" to fail as well

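IMPORTANT: your original expression using look-behind, #3 ( (?<!abc).*\.com ), didn't work because a look-behind ONLY looks at what comes immediately before the next term. Therefore, the "something after abc" has to be included in the look-behind together with abc, as my regex above does.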
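To make the failure of attempt #3 concrete, here is a minimal sketch in Python (an assumption on my part, since the question names no language). The look-behind is checked only where the match starts. At offset 0 there is nothing behind it, so it succeeds trivially and `.*` then consumes the `abc` itself.

```python
import re

# Attempt #3 from the question. Python's re accepts it because the
# look-behind (?<!abc) has a fixed width of three characters.
attempt_3 = re.compile(r'(?<!abc).*\.com')

# The look-behind is tested only at the position where the match begins.
# At offset 0 nothing precedes the match, so the assertion succeeds and
# .* is free to swallow "abc" along with everything else.
unwanted = [s for s in ('abc.com', 'abcAnYTHing.com', 'eabc.com')
            if attempt_3.search(s)]
print(unwanted)
```

All three strings that should be rejected end up in `unwanted`.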

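PROBLEM: my regex may not work with your particular regex engine unless it supports look-behind with variable-length expressions (like the one above), which few engines besides .NET do these days (a good summary of which engines support which flavors of look-around is at http://www.regular-expressions.info/lookaround.html).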
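As an illustration of that limitation, Python's standard `re` module is one engine that rejects variable-length look-behind at compile time. The helper name `supports_pattern` below is made up for this sketch.

```python
import re

def supports_pattern(pattern):
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # re refuses look-behind assertions that are not fixed-width.
        return False

variable_width = supports_pattern(r'(?<!abc.+)\.com$')  # .+ has no fixed width
fixed_width = supports_pattern(r'(?<!abc)\.com$')       # always three characters
print(variable_width, fixed_width)
```

If the first check comes back false on your engine, the two-step match is the way to go.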

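If that is indeed the case, you will have to do a double match: first, check for .com, capturing everything in front of it; then do a negative match on abc. I will use Perl syntax since you didn't specify a language: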

if (/^(.*)\.com$/) {
    if ($1 !~ /abc/) { 
    # Or, you can just use a substring:
    # if (index($1, "abc") < 0) {
        # PROFIT!
    }
}

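In the second case, the EASIEST thing to do is to use a "does not match" operator, e.g. !~ in Perl (or negate the result of a match if your language doesn't support "does not match"). Example using pseudo-code: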

if (NOT string.match(/abc.+\.com$/)) ...

Note that you don't need the ".+"/".*" when using the negative look-behind.
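The pseudo-code for the second case could be sketched in Python roughly as follows. The function name `passes_second_case` is hypothetical, and Python is my choice here rather than anything the question specifies.

```python
import re

# "abc", then at least one character, then ".com" at the end of the string.
ABC_THEN_COM = re.compile(r'abc.+\.com$')

def passes_second_case(s):
    """Accept any string that is NOT of the form abcSOMETHING.com."""
    return ABC_THEN_COM.search(s) is None

for s in ('def.edu', 'a.com', 'abc.com', 'abcAnYTHing.com'):
    print(s, passes_second_case(s))
```

With `.+`, "abc.com" is accepted because nothing sits between "abc" and ".com"; switch to `.*` if it should be rejected too.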

Answer 2

Condensing:

Sorry if I did not make myself clear. Just give some examples.
I want def.edu, abc.com, abce.com, eabc.com and
abcAnYTHing.com do not match,
while a.com, b.com, ab.com, ae.com etc. match.

New regex after the OP's revised examples:
/^(?:(?!abc.*\.com\$|^def\.edu\$).)+\.(?:com|edu)\$/s

use strict;
use warnings;


my @samples = qw/
 <newline>
   shouldn't_pass 
   def.edu  abc.com  abce.com eabc.com 
 <newline>
   should_pass.com
   a.com    b.com    ab.com   ae.com
   abc.edu  def.com  defa.edu
 /;

my $regex = qr
  /
    ^    # Begin string
      (?:  # Group    

          (?!              # Lookahead ASSERTION
                abc.*\.com$     # At any character position, cannot have these in front of us.
              | ^def\.edu$      # (or 'def.*\.edu$')
           )               # End ASSERTION

           .               # This character passes

      )+   # End group, do 1 or more times

      \.   # End of string check,
      (?:com|edu)   # must be a '.com' or '.edu' (remove if not needed)

    $    # End string
  /sx;


print "\nmatch using   /^(?:(?!abc.*\.com\$|^def\.edu\$).)+\.(?:com|edu)\$/s \n";

for  my $str ( @samples )
{
   if ( $str =~ /<newline>/ ) {
      print "\n"; next;
   }

   if ( $str =~ /$regex/ ) {
       printf ("passed - $str\n");
   }
   else {
       printf ("failed - $str\n");
   }
}

Output:

match using   /^(?:(?!abc.*.com$|^def.edu$).)+.(?:com|edu)$/s

failed - shouldn't_pass
failed - def.edu
failed - abc.com
failed - abce.com
failed - eabc.com

passed - should_pass.com
passed - a.com
passed - b.com
passed - ab.com
passed - ae.com
passed - abc.edu
passed - def.com
passed - defa.edu
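For readers who are not on Perl, the same pattern should carry over to Python's `re` module unchanged, since it only uses look-ahead. This is a sketch under that assumption, reusing the sample strings from the Perl script:

```python
import re

# Same pattern as the Perl version; re.S plays the role of the /s flag.
PATTERN = re.compile(r'^(?:(?!abc.*\.com$|^def\.edu$).)+\.(?:com|edu)$', re.S)

should_fail = ["shouldn't_pass", 'def.edu', 'abc.com', 'abce.com', 'eabc.com']
should_pass = ['should_pass.com', 'a.com', 'b.com', 'ab.com', 'ae.com',
               'abc.edu', 'def.com', 'defa.edu']

# Map every sample to True (matched) or False (rejected).
results = {s: bool(PATTERN.search(s)) for s in should_fail + should_pass}
for s, ok in results.items():
    print('passed' if ok else 'failed', '-', s)
```

The look-ahead is re-tested before every character, which is what stops "eabc.com": the group can consume "e" but not the "a" that begins "abc.com".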

Answer 3

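This looks like an XY Problem.

DVK's answer shows you how to tackle this problem with regular expressions, like you asked for.

My solution (in Python) demonstrates that regular expressions are not necessarily the best approach, and that tackling the problem with your programming language's string-handling functionality may produce a more efficient and more maintainable solution.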

#!/usr/bin/env python

import unittest

def is_valid_domain(domain):
    return domain.endswith('.com') and 'abc' not in domain

class TestIsValidDomain(unittest.TestCase):

    def test_edu_invalid(self):
        self.assertFalse(is_valid_domain('def.edu'))

    def test_abc_invalid(self):
        self.assertFalse(is_valid_domain('abc.com'))
        self.assertFalse(is_valid_domain('abce.com'))
        self.assertFalse(is_valid_domain('abcAnYTHing.com'))

    def test_dotcom_valid(self):
        self.assertTrue(is_valid_domain('a.com'))
        self.assertTrue(is_valid_domain('b.com'))
        self.assertTrue(is_valid_domain('ab.com'))
        self.assertTrue(is_valid_domain('ae.com'))

if __name__ == '__main__':
    unittest.main()

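See it run!

Update

Even in a language like Perl, where regular expressions are idiomatic, there's no reason to squash all of your logic into a single regex. A function like this would be far easier to maintain: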

sub is_domain_valid {
    my $domain = shift;
    return $domain =~ /\.com$/ && $domain !~ /abc/;
}

(I'm not a Perl programmer, but this runs and gives the results that you desire)

Answer 4

Do you just want to exclude strings that start with abc? That is, would xyzabc.com be okay? If so, this regex should work:

^(?!abc).+\.com$

If you want to make sure abc doesn't appear anywhere, use this:

^(?:(?!abc).)+\.com$

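In the first regex, the lookahead is applied only once, at the beginning of the string. In the second regex, the lookahead is applied every time the . is about to match a character, ensuring that the character is not the beginning of an abc sequence.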
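A quick way to see the difference between the two is to run both against a string with abc in the middle. This is a minimal sketch in Python; the variable names are mine.

```python
import re

# Look-ahead checked once, at the start of the string.
starts_only = re.compile(r'^(?!abc).+\.com$')
# Look-ahead checked before every character the group consumes.
anywhere = re.compile(r'^(?:(?!abc).)+\.com$')

for s in ('a.com', 'abc.com', 'xyzabc.com', 'eabc.com', 'def.edu'):
    print(s, bool(starts_only.match(s)), bool(anywhere.match(s)))
```

Only the second pattern rejects "xyzabc.com" and "eabc.com", which is the behaviour the edited question asks for.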

Match some characters between two words in regex
