IGNORECASE bug in Python's re.Scanner?

Asked: 2015-09-30 05:18:51

Tags: python regex case-insensitive

The re module contains an undocumented but well-known piece of functionality, re.Scanner:

import re

def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

scanner = re.Scanner([
    (r"[a-zA-Z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ])

print(scanner.scan("Sum = 3*foo + 312.50 + bar"))
# (['Sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')

I would like to use the IGNORECASE flag here, but it does not seem to work:

import re

def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

scanner = re.Scanner([
    (r"(?i)[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ])

print(scanner.scan("Sum = 3*foo + 312.50 + bar"))
# ([], 'Sum = 3*foo + 312.50 + bar')

Is this a problem with the Scanner, or a mistake in my code? Is case-insensitive matching possible with Scanner at all?

This issue was originally reproduced on Python 2.7.9.

Expected value: (['Sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar'], '')

Actual value: ([], 'Sum = 3*foo + 312.50 + bar')

1 Answer:

Answer 0 (score: 0)

You can pass a flags argument to the Scanner constructor; it is then applied to every pattern in the lexicon:

scanner = re.Scanner([
    (r"[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),
    ], flags=re.IGNORECASE)
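
For completeness, here is a self-contained sketch of the whole scanner with the flag passed to the constructor. It reuses the callbacks from the question and calls print() as a function, so it should run on both Python 2.7 and Python 3. Keep in mind that re.Scanner is undocumented, so its behaviour is not guaranteed to stay the same across versions.

```python
import re

# Callbacks copied from the question: each receives the scanner and the matched text.
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)

# The flag is given once to the constructor instead of inline in a pattern.
scanner = re.Scanner([
    (r"[a-z]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),  # whitespace is matched and dropped
    ], flags=re.IGNORECASE)

# scan() returns a (tokens, unconsumed_remainder) pair.
tokens, remainder = scanner.scan("Sum = 3*foo + 312.50 + bar")
print(tokens)
# ['Sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5, 'op+', 'bar']
print(repr(remainder))
# ''
```

An empty remainder means the whole input was tokenized.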

Source of Scanner: https://github.com/python/cpython/blob/master/Lib/re.py#L345
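
As for why the failing version returns an empty token list rather than raising an error: scan() stops at the first position where no pattern matches and hands back the rest of the input unconsumed. If the inline (?i) is not honoured when Scanner combines the phrases into one compound pattern (which is what the output in the question suggests), the leading "S" matches nothing and scanning ends immediately. The sketch below shows that stopping behaviour with a deliberately case-sensitive lexicon; the name `strict` and the reduced lexicon are only for illustration.

```python
import re

# Hypothetical reduced lexicon: identifiers must start with a lowercase letter.
strict = re.Scanner([
    (r"[a-z]\w*", lambda scanner, token: token),
    (r"\s+", None),  # whitespace is matched and dropped
])

# Scanning stops at the first character that no pattern matches.
partial = strict.scan("foo Bar baz")
print(partial)
# (['foo'], 'Bar baz')

# If the very first character is unmatched, nothing is consumed at all.
nothing = strict.scan("Sum = 3")
print(nothing)
# ([], 'Sum = 3')
```

Checking that the second element of the result is empty is therefore a simple way to detect input the lexicon could not handle.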