Regex to match text into multiple groups

Time: 2018-12-12 15:44:28

Tags: python regex

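I'm trying to set up a regex to match text, and I want a specific string, if it is present, to be captured in a separate group from the rest of the text.

For example, if my string is "this is a test", I want "this is a" to match the first group and "test" to match the second group. I'm using the Python regex library. Here are some more examples of the results I'd like:

  • this is a test - group 1: this is a, group 2: test

  • one day at a time - group 1: one day at a time, group 2:

  • one day test is - group 1: one day, group 2: test

  • testing, 1,2,3 - no match

  • this is not a drill - group 1: this is not a drill, group 2:

In these cases, the specific string I'm matching in the second group is test. I'm not sure how to set up the regex to match these particular cases correctly.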

2 answers:

Answer 0 (score: 1)

You can try the following regex:

^(this.*?)(test)?$

Explanation of the regex:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    this                     'this'
--------------------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (                        group and capture to \2 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    test                     'test'
--------------------------------------------------------------------------------
  )?                       end of \2 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \2)
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
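
One way to check this pattern is to run it against the sample strings from the question. The sketch below uses Python's standard re module; the SAMPLES list and the try_pattern helper are illustrative names, not part of the answer. Because the pattern begins with the literal this, only inputs that start with this can match, and group 1 keeps the trailing space before test.

```python
import re

# The pattern from the answer above, unchanged.
PATTERN = re.compile(r"^(this.*?)(test)?$")

# Sample inputs taken from the question.
SAMPLES = [
    "this is a test",
    "one day at a time",
    "one day test is",
    "testing, 1,2,3",
    "this is not a drill",
]


def try_pattern(text):
    """Return (group 1, group 2) for a match, or None if there is no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None


for s in SAMPLES:
    print(repr(s), "->", try_pattern(s))
```

With these inputs, the first and last samples match, and the other three return None. The two samples that do not begin with this ("one day at a time" and "one day test is") would therefore need a more general pattern.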

Answer 1 (score: 1)

You can try this one, mate:

^(?:(?!test))(?:(.*)(?=\btest\b)(\btest\b)|(.*))
  

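Explanation

  • ^(?:(?!test)) - Negative lookahead. Rejects any string that starts with test.
  • (.*) - Matches anything except a newline.
  • (?=\btest\b) - Positive lookahead. Asserts that test, delimited by word boundaries, comes next, without consuming it.
  • (\btest\b) - Capturing group that matches test.
  • | - Alternation, the same as a logical OR.
  • (.*) - Matches anything except a newline.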
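
To see how the alternation behaves in Python, here is a minimal sketch using the standard re module. The split_on_test helper and its whitespace stripping are assumptions added for illustration, not part of the answer. Since the pattern has three capturing groups, text that does not contain test lands in group 3 rather than group 1, so the helper merges the two cases into one result.

```python
import re

# The pattern from the answer above, unchanged.
PATTERN = re.compile(r"^(?:(?!test))(?:(.*)(?=\btest\b)(\btest\b)|(.*))")


def split_on_test(text):
    """Hypothetical helper: return (prefix, keyword), or None if no match.

    keyword is "test" when the word is present, otherwise an empty string.
    """
    m = PATTERN.match(text)
    if m is None:
        return None
    before, keyword, whole = m.groups()
    if keyword is not None:
        # First alternative matched: groups 1 and 2 are set.
        return before.strip(), keyword
    # Second alternative matched: only group 3 is set.
    return whole.strip(), ""


for s in ["this is a test", "one day test is", "testing, 1,2,3"]:
    print(repr(s), "->", split_on_test(s))
```

Stripping the prefix is a design choice made here so the results line up with the question's examples; without it, group 1 would keep the space that precedes test.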
