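Regular expression tester

Time: 2016-03-25 17:02:28

Tags: python regex python-3.x

There are 6 lines of input. The first line contains 10 strings. Each of the last 5 lines contains a valid regular expression. For each regular expression, print every string from line 1 that matches it; if nothing matches, print none. The character # is used to denote the empty string.

Sample input: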

1)#,aac,acc,abc,ac,abbc,abbbc,abbbbc,aabc,accb 
2)a.c 
3)a[ab]c 
4)a[^ab]c 
5)ab*c 
6)ab{2,4}c 

Sample output:

1)aac,acc,abc
2)aac,abc
3)acc
4)ac,abc,abbc,abbbc,abbbbc
5)abbc,abbbc,abbbbc

Why doesn't this code work?

import re
inp = input("Search String:").upper().split()
for runs in range(5):
    temp = []
    query = input("Search Query:").replace("*", ".*").replace("?", "[A-Z0-9]+?+$").upper()
    for item in inp:
        search = re.search(query, item)
        if search: # means match
            temp.append(item)
    if len(temp) > 0:
        print(" ".join(temp))
    else:
        print("No Match")

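2 answers:

Answer 0 (score: 1)

I think you are conflating regular expressions with globs, which suggests you are not quite the regex beginner you say you are. That said, here is some code that shows how the glob and regex interpretations of the same patterns differ.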

import re

EXAMPLE_INPUT = """
1)#,aac,abc,ac,abbc,abbbc,abbbbc,aabc,accb,ab4c
2)a.c
3)a[ab]c
4)a[^ab]c
5)ab*c
6)ab{2,4}c
"""

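# Drop the "N)" prefix from each line of the sample input.
lines = [x[2:] for x in map(str.strip, EXAMPLE_INPUT.strip().split('\n'))]
# '#' stands for the empty string.
search_strings = [s if s != '#' else '' for s in lines[0].split(',')]
patterns = lines[1:]

for pat in patterns:
    # Reinterpret the same text as a glob and translate it to a regex:
    # '.' is literal, '*' is any run of characters, '?' is any one character.
    glob = pat.replace('.', r'\.').replace('*', r'.*').replace('?', r'.')
    # should also do:  .replace('[^', '[!') but you used ^ everywhere
    # Treat {a,b} as glob alternation, i.e. the regex (a|b).
    glob = re.sub(r'{([^}]*)}', lambda m: '(' + m.group(1).replace(',', '|') + ')', glob)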
    for ss in search_strings:
        if re.search(pat, ss):
            print("Regex '{}' matches '{}'".format(pat, ss))
        if re.search(glob, ss):
            print("Glob '{}' matches '{}'".format(pat, ss))

The output is:

Regex 'a.c' matches 'aac'
Regex 'a.c' matches 'abc'
Regex 'a.c' matches 'aabc'
Regex 'a.c' matches 'accb'
Regex 'a[ab]c' matches 'aac'
Glob 'a[ab]c' matches 'aac'
Regex 'a[ab]c' matches 'abc'
Glob 'a[ab]c' matches 'abc'
Regex 'a[ab]c' matches 'aabc'
Glob 'a[ab]c' matches 'aabc'
Regex 'a[^ab]c' matches 'accb'
Glob 'a[^ab]c' matches 'accb'
Regex 'ab*c' matches 'aac'
Regex 'ab*c' matches 'abc'
Glob 'ab*c' matches 'abc'
Regex 'ab*c' matches 'ac'
Regex 'ab*c' matches 'abbc'
Glob 'ab*c' matches 'abbc'
Regex 'ab*c' matches 'abbbc'
Glob 'ab*c' matches 'abbbc'
Regex 'ab*c' matches 'abbbbc'
Glob 'ab*c' matches 'abbbbc'
Regex 'ab*c' matches 'aabc'
Glob 'ab*c' matches 'aabc'
Regex 'ab*c' matches 'accb'
Glob 'ab*c' matches 'ab4c'
Regex 'ab{2,4}c' matches 'abbc'
Regex 'ab{2,4}c' matches 'abbbc'
Regex 'ab{2,4}c' matches 'abbbbc'
Glob 'ab{2,4}c' matches 'ab4c'
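
The same contrast can be reproduced without a hand-written conversion, because Python's standard `fnmatch` module implements shell-style glob matching. The following is only an illustrative sketch and not part of the original answer; the helper names `glob_hits` and `regex_hits` and the short `STRINGS` list are made up for the example. Both helpers require the whole string to match, so the only difference left is how the pattern text is interpreted.

```python
import fnmatch
import re

# A few strings borrowed from the sample input above.
STRINGS = ["ac", "abc", "abbc", "ab4c", "aabc"]

def glob_hits(pattern, strings):
    """Strings that match `pattern` read as a shell-style glob."""
    return [s for s in strings if fnmatch.fnmatchcase(s, pattern)]

def regex_hits(pattern, strings):
    """Strings that match `pattern` read as a regular expression."""
    return [s for s in strings if re.fullmatch(pattern, s)]

# Glob: '*' means "any run of characters", so 'ab4c' matches but 'ac' does not.
print(glob_hits("ab*c", STRINGS))
# Regex: 'b*' means "zero or more b", so 'ac' matches but 'ab4c' does not.
print(regex_hits("ab*c", STRINGS))
```

The task statement says the input lines are regular expressions, so rewriting `*` into `.*` before matching changes the meaning of the pattern.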

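Answer 1 (score: 1)

You forgot to pass ',' as the argument to split().

Also, when you want to match complete strings, checking whether search is None is not enough. re.search() scans the whole text and succeeds if the pattern matches any substring of it (I assume you only want matches that begin at the start of the string).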
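
A small sketch of how the anchoring differs between the three matching functions in the `re` module, using the sample pattern `a.c`. The helper `classify` is hypothetical and exists only for this illustration; `re.fullmatch()` requires Python 3.4 or later.

```python
import re

def classify(pattern, text):
    """Return (found anywhere, found at start, matches whole string)."""
    return (
        re.search(pattern, text) is not None,     # match may begin anywhere
        re.match(pattern, text) is not None,      # match must begin at index 0
        re.fullmatch(pattern, text) is not None,  # match must cover all of text
    )

for text in ["abc", "accb", "aabc"]:
    print(text, classify("a.c", text))
```

Here `aabc` is accepted only by `re.search()`, while `accb` is still accepted by `re.match()` because `acc` sits at the start of the string. The sample output in the question lists neither string for `a.c`, which suggests that whole-string matching is what the task expects.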

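To fix this, we can use re.match() instead of re.search(). re.match() anchors the match at the start of the string; if the match must also reach the end of the string, re.fullmatch() is the stricter choice.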

import re

inp = input("Search String:").upper().split(',')

for runs in range(5):
    temp = []
    query = input("Search Query:").replace("*", ".*").replace("?", "[A-Z0-9]+?+$").upper()
    for item in inp:
        search = re.match(query, item)
        if search:
            if search.group() not in temp:
                temp.append(search.group())
    if len(temp) > 0:
        print(" ".join(temp))
    else:
        print("No Match")
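
For comparison, here is a minimal sketch of a tester that follows the task statement literally. It is not taken from either answer and rests on several assumptions: the `1)` to `6)` prefixes in the sample are labels and not part of the input, matching means matching the whole string (hence `re.fullmatch()`), patterns are used unchanged with no glob rewriting and no `upper()`, and an empty-string match is printed back as `#`. The function name `run_tester` is hypothetical.

```python
import re

def run_tester(lines):
    """Return one output line per pattern for the six input lines."""
    # Split on commas; '#' in the first line stands for the empty string.
    raw = [part.strip() for part in lines[0].split(",")]
    strings = ["" if s == "#" else s for s in raw]

    results = []
    for pattern in lines[1:6]:
        # Use the pattern as written; it is already a regular expression.
        hits = [s for s in strings if re.fullmatch(pattern.strip(), s)]
        if hits:
            # Assumption: show an empty-string match as '#' again.
            results.append(",".join(s or "#" for s in hits))
        else:
            results.append("none")
    return results

sample = [
    "#,aac,acc,abc,ac,abbc,abbbc,abbbbc,aabc,accb",
    "a.c",
    "a[ab]c",
    "a[^ab]c",
    "ab*c",
    "ab{2,4}c",
]
for line in run_tester(sample):
    print(line)
```

Matches are reported in the order the strings appear on the first line, so for `ab*c` the string `abc` is listed before `ac`, unlike the sample output. For interactive use, the six lines could be read with `input()` instead of being hard-coded.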