Question

I have a pattern that finds certain words, etc. in a string. Here is my code:

    import re

    pattern = {
        r"eval\(.*\)",  # raw string so the backslashes reach the regex engine
        "hello",
        "my word"
    }

    patterns = "|".join(pattern)
    patterns = "(^.*?(" + patterns + ").*?$)"

    code = code.strip()

    m = re.findall(patterns, code, re.IGNORECASE | re.MULTILINE | re.UNICODE)

    if m:
        return m

How can I see which of these words (eval(), hello, ...) was found? In PHP I have the function preg_match_all to get the matched words.

Answer 1

I don't know whether this is what you intended, but your regex has two levels of capture groups:

    (^.*?(hello|my word|eval\(.*\)).*?$)

The outer group captures the whole line, while the inner group captures only the specified word.

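When a pattern contains more than one capture group, re.findall returns a list of tuples, one tuple per match, holding the captured groups. In your particular case, this will be: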

    [(outer_group, inner_group), (outer_group, inner_group), ...]

To iterate over this, you can do:

    for line, words in m:
        print('line:', line)
        print('words:', words)

Or, to access the items directly, do:

    line = m[0][0]
    words = m[0][1]
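
For reference, here is a small self-contained sketch of the two-group behaviour described above. The sample text in `code` is made up for illustration; only the pattern comes from the question.

```python
import re

# Hypothetical sample input (not from the question).
code = "i say hello to u\nnothing here\nx = eval(data)"

# Outer group = whole line, inner group = the word that matched.
patterns = r"(^.*?(hello|my word|eval\(.*\)).*?$)"

m = re.findall(patterns, code, re.IGNORECASE | re.MULTILINE)

# Each item is an (outer_group, inner_group) tuple.
for line, word in m:
    print("line:", line)
    print("word:", word)
```

Lines that contain none of the words (here the second one) simply produce no tuple.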

NB:

If you remove the outer group, or make it non-capturing, like this:

    ^.*?(hello|my word|eval\(.*\)).*?$

or

    (?:^.*?(hello|my word|eval\(.*\)).*?$)

then there is only one capture group. In that case, re.findall returns a flat list of matches (i.e. just single strings, not tuples).
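
A quick sketch of the single-group case, again with made-up sample text in `code`:

```python
import re

# Hypothetical sample input (not from the question).
code = "i say hello to u\nthis is My Word\nnothing here"

# Only the inner group captures, so findall yields plain strings.
patterns = r"^.*?(hello|my word|eval\(.*\)).*?$"

words = re.findall(patterns, code, re.IGNORECASE | re.MULTILINE)
print(words)
```

Because of re.IGNORECASE, the captured text keeps the casing found in the input rather than the casing used in the pattern.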

Answer 2

    import re

    pattern = {
        r"eval\(.*\)",
        "hello",
        "my word"
    }
    patterns = "|".join(pattern)
    patterns = "^.*?(" + patterns + ").*?$"

    code = "i say hello to u"

    m = re.match(patterns, code, re.IGNORECASE | re.MULTILINE | re.UNICODE)

    if m:
        print(m.group())   # the line that matched
        print(m.group(1))  # the word that matched

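You need match instead of findall.

m.group() gives you the whole line that matched, and m.group(1) gives you the word that matched. This pattern has only one capture group, so there is no group(2). Note that re.match only tries to match at the very start of the string, even with re.MULTILINE, so only the first line is checked.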
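
If you need every matching line rather than just the first, one option (not part of the answer above, just a sketch) is re.finditer, which yields a match object per match and is probably the closest thing to PHP's preg_match_all. The sample text in `code` and the name `found` are made up for illustration.

```python
import re

# A list keeps the alternation order predictable.
words = [r"eval\(.*\)", "hello", "my word"]
pattern = re.compile(
    "^.*?(" + "|".join(words) + ").*?$",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical sample input (not from the question).
code = "i say hello to u\nresult = eval(expr)"

found = []
for m in pattern.finditer(code):
    # group(1) is the word that matched, group(0) is the whole line.
    found.append((m.group(1), m.group(0)))
    print("word:", m.group(1), "| line:", m.group(0))
```

Each match object also exposes m.start(1) and m.end(1) if you need the position of the word.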

How do I find out which word matched?
