Question

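This one is a bit complicated and somewhat out of my league. I want to go through a list of words and eliminate those that are not made up of a specific set of characters. The characters can appear in any order, and some may occur more often than others.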

I want the regex to find any word that contains:

e 0 or 1 times
a 0 or 1 times
t 0, 1, or 2 times

For example, the following should match:

eat tea tate tt a e

The following should not match:

eats teas tates ttt aa ee

Lookaround regex is new to me, so I'm not 100% sure of the syntax (any answer that uses lookarounds and explains them would be great). My best guess so far:

Regex regex = new Regex(@"(?=.*e)(?=.*a)(?=.*t)");
lines = lines.Where(x => regex.IsMatch(x)).ToArray(); // 'lines' is the array containing the words

Answer 1

Not sure, but try:

\b(?:e(?!\w*e)|t(?!(?:\w*t){2})|a(?!\w*a))+\b

Explanation:

\b             # Start of word
(?:            # Start of group: Either match...
 e             # an "e",
 (?!\w*e)      # unless another e follows within the same word,
|              # or
 t             # a "t",
 (?!           # unless...
  (?:\w*t){2}  # two more t's follow within the same word,
 )             # 
|              # or
 a             # an "a"
 (?!\w*a)      # unless another a follows within the same word.
)+             # Repeat as needed (at least one letter)
\b             # until we reach the end of the word.

Test it live on regex101.com.

(For simplicity, I used the \w character class; if you want to define the allowed "word characters" differently, replace it accordingly.)
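To tie this back to the filtering code in the question, here is a minimal sketch of the pattern applied to an array of words. It assumes each array element holds exactly one word, so the \b boundaries are swapped for ^ and $ to force the whole string to match; that swap, the sample data, and the variable name kept are my assumptions, not part of the answer. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Pattern from the answer above, with \b replaced by ^ and $
// (assumption: every element of the array is a single word).
var regex = new Regex(@"^(?:e(?!\w*e)|t(?!(?:\w*t){2})|a(?!\w*a))+$");

// Sample words taken from the question: six that should pass, six that should not.
string[] lines = { "eat", "tea", "tate", "tt", "a", "e",
                   "eats", "teas", "tates", "ttt", "aa", "ee" };

// Keep only the words that satisfy the per-letter limits.
string[] kept = lines.Where(x => regex.IsMatch(x)).ToArray();

Console.WriteLine(string.Join(" ", kept)); // eat tea tate tt a e
```

If the input is running text rather than one word per element, the original \b version together with Regex.Matches is probably the better fit.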

Answer 2

This is probably the same as the other answers; I haven't formatted those to find out.

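Note that assertions are forced to match: they cannot be optional (unless you specifically make them optional, but to what end?), and they are not directly affected by backtracking.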
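As a small illustration of that point, the sketch below runs the guess from the question. Each lookahead there is mandatory, so the pattern demands at least one e, one a, and one t, and since lookaheads consume nothing, the match itself is empty. The variable names guess and m are mine.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from the question: three mandatory, zero-width lookaheads.
var guess = new Regex(@"(?=.*e)(?=.*a)(?=.*t)");

// "eats" contains e, a and t, so every assertion succeeds at position 0,
// even though the word should be rejected because of the 's'.
Match m = guess.Match("eats");
Console.WriteLine(m.Success); // True
Console.WriteLine(m.Length);  // 0 (assertions consume no characters)

// "tt" should be accepted, but the mandatory (?=.*e) can never succeed.
Console.WriteLine(guess.IsMatch("tt")); // False
```

That is why the patterns here use negative lookaheads to cap the counts and then let a character class consume the letters.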

This works; the explanation is in the formatted regex.

Update:
To use whitespace boundaries, use:

(?<!\S)(?!\w*(?:e\w*){2})(?!\w*(?:a\w*){2})(?!\w*(?:t\w*){3})[eat]+(?!\S)

Formatted:

 (?<! \S )
 (?!
      \w* 
      (?: e \w* ){2}
 )
 (?!
      \w* 
      (?: a \w* ){2}
 )
 (?!
      \w* 
      (?: t \w* ){3}
 )
 [eat]+ 
 (?! \S )

To use ordinary word boundaries, use:

\b(?!\w*(?:e\w*){2})(?!\w*(?:a\w*){2})(?!\w*(?:t\w*){3})[eat]+\b

Formatted:

 \b                     # Word boundary
 (?!                    # Lookahead, assert Not 2 'e' s
      \w* 
      (?: e \w* ){2}
 )
 (?!                    #  Lookahead, assert Not 2 'a' s
      \w* 
      (?: a \w* ){2}
 )
 (?!                    #  Lookahead, assert Not 3 't' s
      \w* 
      (?: t \w* ){3}
 )
 # At this point all the checks have passed;
 # all that's left is to match the letters.
 # -------------------------------------------------

 [eat]+                 # 1 or more of these, Consume letters 'e' 'a' or 't'
 \b                     # Word boundary
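The formatted versions can be used more or less as written in .NET by passing RegexOptions.IgnorePatternWhitespace. The sketch below does that and contrasts the two boundary styles on a made-up input; the names limits, wordBounded, and spaceBounded and the sample string are assumptions for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The three shared lookaheads, in free-spacing form with comments.
const string limits = @"
    (?! \w* (?: e \w* ){2} )   # not 2 or more 'e's
    (?! \w* (?: a \w* ){2} )   # not 2 or more 'a's
    (?! \w* (?: t \w* ){3} )   # not 3 or more 't's
";

// IgnorePatternWhitespace makes the engine skip unescaped whitespace
// and treat '#' as the start of a comment.
var opts = RegexOptions.IgnorePatternWhitespace;
var wordBounded  = new Regex(@"\b" + limits + @"[eat]+ \b", opts);
var spaceBounded = new Regex(@"(?<! \S )" + limits + @"[eat]+ (?! \S )", opts);

const string input = "eat-tea ttt tate";

// \b treats the hyphen as a boundary, so "eat" and "tea" match separately.
string byWord = string.Join(" ",
    wordBounded.Matches(input).Cast<Match>().Select(m => m.Value));

// Whitespace boundaries require the whole space-delimited token to qualify,
// so "eat-tea" is rejected as a unit.
string bySpace = string.Join(" ",
    spaceBounded.Matches(input).Cast<Match>().Select(m => m.Value));

Console.WriteLine(byWord);  // eat tea tate
Console.WriteLine(bySpace); // tate
```

Which variant is right depends on whether punctuation-joined tokens such as "eat-tea" should count as one word or two.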

Regex to restrict words to a specific combination of letters (in any order)

2 answers: