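Say I have a regular expression:
match = re.search(pattern, content)
if not match:
    raise Exception('regex traceback')  # I want to throw the regex matching process here
If the regular expression fails to match, I want to raise an exception that describes how the match proceeded and where it failed to match the pattern, at what stage, and so on. Is it even possible to implement this?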
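Answer 0 (score: 0)
I have used Kodos (http://kodos.sourceforge.net/about.html) in the past to debug regular expressions. It is not the ideal solution, since you want something that works at runtime, but it may be helpful to you.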
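Answer 1 (score: 0)
If you need to test a regular expression, you can make each group repeatable by following it with *, as in (sometext)*. Apply this to the regex you want to check, and you should then be able to work out where it fails.
Then make use of the following match object attributes, as described on python.org:
pos: The value of pos that was passed to the search() or match() method of the RegexObject. This is the index into the string at which the RE engine started looking for a match.
endpos: The value of endpos that was passed to the search() or match() method of the RegexObject. This is the index into the string beyond which the RE engine will not go.
lastindex: The integer index of the last matched capturing group, or None if no group was matched at all. For example, the expressions (a)b, ((a)(b)), and ((ab)) will have lastindex == 1 if applied to the string 'ab', while the expression (a)(b) will have lastindex == 2 if applied to the same string.
lastgroup: The name of the last matched capturing group, or None if the group did not have a name, or if no group was matched at all.
re: The regular expression object whose match() or search() method produced this MatchObject instance.
string: The string passed to match() or search().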
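As a quick illustration of the attributes listed above, here is a minimal sketch in Python 3. The pattern and the sample string are made up for the example; only the attribute names come from the documentation quoted above.

```python
import re

# Hypothetical pattern and text, chosen only to exercise the attributes.
pattern = re.compile(r'(?P<word>a)(b)?')
text = 'xxab'

# search() accepts optional pos and endpos arguments.
m = pattern.search(text, 1, 4)

print(m.pos, m.endpos)   # the search window that was passed to search()
print(m.lastindex)       # index of the last capturing group that matched
print(m.lastgroup)       # name of that group, or None if it is unnamed
print(m.re is pattern)   # the compiled pattern that produced this match
print(m.string)          # the string that was searched

# The expressions from the documentation excerpt, applied to 'ab'.
for expr in (r'(a)b', r'((a)(b))', r'((ab))', r'(a)(b)'):
    print(expr, re.match(expr, 'ab').lastindex)
```

Note that lastgroup only helps when the groups are named; with unnamed groups, lastindex is the attribute to look at.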
So here is a very simple example:
>>> import re
>>> m1 = re.compile(r'the real thing')
>>> m2 = re.compile(r'(the)* (real)* (thing)*')
>>> if not m1.search(mytextvar):
...     res = m2.search(mytextvar)
...     # the groups are unnamed, so lastgroup would be None; use lastindex
...     print(res.lastindex)
...     # raise my exception
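To connect this back to the question, the idea above could be wrapped in a function that raises an exception describing how far the relaxed pattern got. This is only a sketch in Python 3: RegexMatchError and search_or_raise are hypothetical names introduced for the illustration, and the two patterns are the ones from the example above.

```python
import re


class RegexMatchError(Exception):
    """Hypothetical exception type used only for this illustration."""


STRICT = re.compile(r'the real thing')
# Same words, each in a repeatable group, so a partial match still succeeds.
LOOSE = re.compile(r'(the)* (real)* (thing)*')


def search_or_raise(text):
    """Return the strict match, or raise with a hint about the partial match."""
    match = STRICT.search(text)
    if match:
        return match
    partial = LOOSE.search(text)
    if partial is None or partial.lastindex is None:
        raise RegexMatchError('no part of the pattern matched')
    raise RegexMatchError(
        'matched up to group %d (%r), stopped at index %d'
        % (partial.lastindex, partial.group(partial.lastindex), partial.end()))


try:
    search_or_raise('the real thin')
except RegexMatchError as exc:
    print(exc)
```

One limitation: search() returns the leftmost match of the relaxed pattern, which is not necessarily the most complete one, so the hint can point at an earlier, shorter partial match.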
Answer 2 (score: 0)
I have something that helps me debug complex regex patterns in my code. Does this help you?
import re

li = ('ksjdhfqsd\n'
      '5 12478 abdefgcd ocean__12      ty--\t\t ghtr789\n'
      'qfgqrgqrg',
      '6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12340\n',
      '2 47890 bbcedefg arctic__124    **juyf\t\t ghtr89877',
      '9 54879 bbdecddf antarctic__13  18:13pomodoro\t\t ghtr6798',
      'ksjdhfqsd\n'
      '5 12478 abdefgcd ocean__1247101247887 ty--\t\t ghtr789\n'
      'qfgqrgqrg',
      '6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12940\n',
      '25 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877',
      '9 54879 bbdeYddf antarctic__13 18:13pomodoro\t\t ghtr6798')

# The pattern is split into pieces so that growing prefixes can be tested.
tupleRE = (r'^\d',
           ' ',
           r'\d{5}',
           ' ',
           '[abcdefghi]+',
           ' ',
           '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)',
           '[a-z]+',
           '__',
           r'[\d]+',
           ' +',
           '[^\t]+',
           '\t\t',
           ' ',
           'ght',
           '(r[5-9]+|u[0-4]+)',
           '$')

def REtest(ch, tupleRE, flags=re.MULTILINE):
    for n in range(len(tupleRE)):
        regx = re.compile(''.join(tupleRE[:n+1]), flags)
        testmatch = regx.search(ch)
        if not testmatch:
            print('\n -*- tupleRE :\n')
            print('\n'.join(str(i).zfill(2)+' '+repr(u)
                            for i, u in enumerate(tupleRE[:n])))
            print(' --------------------------------')
            # the regex stops matching once element n is added
            print(str(n).zfill(2)+' '+repr(tupleRE[n])
                  + " doesn't match anymore from this ligne "
                  + str(n)+' of tupleRE')
            print('\n'.join(str(n+1+j).zfill(2)+' '+repr(u)
                            for j, u in enumerate(tupleRE[n+1:
                                                          min(n+2, len(tupleRE))])))
            # find the longest prefix of tupleRE that still matches
            match = None
            for i in range(n):
                match = re.search(''.join(tupleRE[:n-i]), ch, flags)
                if match:
                    break
            # if even the first element fails (n == 0), nothing matched
            matching_portion = match.group() if match else ''
            matching_li = '\n'.join(map(repr,
                                        matching_portion.splitlines(True)[-5:]))
            fin_matching_portion = match.end() if match else 0
            print('\n\n -*- Part of the tested string which is concerned :\n\n'
                  '######### matching_portion ########\n'+matching_li + '\n'
                  '##### end of matching_portion #####\n'
                  '-----------------------------------\n'
                  '######## unmatching_portion #######')
            print('\n'.join(map(repr,
                                ch[fin_matching_portion:
                                   fin_matching_portion+300].splitlines(True))))
            break
    else:
        print('\n SUCCESS . The regex integrally matches.')

for x in li:
    print(' -*- Analyzed string :\n%r' % x)
    REtest(x, tupleRE)
    print('\nmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm')
Result
-*- Analyzed string :
'ksjdhfqsd\n5 12478 abdefgcd ocean__12      ty--\t\t ghtr789\nqfgqrgqrg'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12340\n'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'2 47890 bbcedefg arctic__124    **juyf\t\t ghtr89877'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'9 54879 bbdecddf antarctic__13  18:13pomodoro\t\t ghtr6798'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'ksjdhfqsd\n5 12478 abdefgcd ocean__1247101247887 ty--\t\t ghtr789\nqfgqrgqrg'
-*- tupleRE :
00 '^\\d'
01 ' '
02 '\\d{5}'
03 ' '
04 '[abcdefghi]+'
05 ' '
--------------------------------
06 '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)' doesn't match anymore from this ligne 6 of tupleRE
07 '[a-z]+'
-*- Part of the tested string which is concerned :
######### matching_portion ########
'5 12478 abdefgcd '
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'ocean__1247101247887 ty--\t\t ghtr789\n'
'qfgqrgqrg'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12940\n'
-*- tupleRE :
00 '^\\d'
01 ' '
02 '\\d{5}'
03 ' '
04 '[abcdefghi]+'
05 ' '
06 '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)'
07 '[a-z]+'
08 '__'
09 '[\\d]+'
10 ' +'
11 '[^\t]+'
12 '\t\t'
13 ' '
14 'ght'
15 '(r[5-9]+|u[0-4]+)'
--------------------------------
16 '$' doesn't match anymore from this ligne 16 of tupleRE
-*- Part of the tested string which is concerned :
######### matching_portion ########
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12'
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'940\n'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'25 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877'
-*- tupleRE :
00 '^\\d'
--------------------------------
01 ' ' doesn't match anymore from this ligne 1 of tupleRE
02 '\\d{5}'
-*- Part of the tested string which is concerned :
######### matching_portion ########
'2'
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'5 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'9 54879 bbdeYddf antarctic__13 18:13pomodoro\t\t ghtr6798'
-*- tupleRE :
00 '^\\d'
01 ' '
02 '\\d{5}'
03 ' '
04 '[abcdefghi]+'
--------------------------------
05 ' ' doesn't match anymore from this ligne 5 of tupleRE
06 '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)'
-*- Part of the tested string which is concerned :
######### matching_portion ########
'9 54879 bbde'
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'Yddf antarctic__13 18:13pomodoro\t\t ghtr6798'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
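If an exception is wanted rather than printed output, as the question asks, the same prefix-by-prefix idea can be condensed into a small function. The following is a sketch in Python 3, not the code above: RegexPieceError and search_pieces are hypothetical names, and it assumes every prefix of the pieces joined together is itself a valid regex (so a group must not be split across two pieces).

```python
import re


class RegexPieceError(Exception):
    """Hypothetical exception reporting which piece of the pattern failed."""

    def __init__(self, index, piece, matched_end):
        super().__init__(
            'piece %d (%r) fails; the pieces before it matched up to index %d'
            % (index, piece, matched_end))
        self.index = index              # position of the failing piece
        self.piece = piece              # the failing piece itself
        self.matched_end = matched_end  # end of the last successful prefix match


def search_pieces(pieces, text, flags=0):
    """Search text with ''.join(pieces), raising at the first failing piece."""
    if not pieces:
        raise ValueError('pieces must not be empty')
    last_end = 0
    for n in range(1, len(pieces) + 1):
        # Try the pattern made of the first n pieces only.
        match = re.search(''.join(pieces[:n]), text, flags)
        if match is None:
            raise RegexPieceError(n - 1, pieces[n - 1], last_end)
        last_end = match.end()
    return match


# Small made-up pattern, in the spirit of the first elements of tupleRE.
pieces = (r'^\d', ' ', r'\d{5}')
print(search_pieces(pieces, '9 54879').group())
try:
    search_pieces(pieces, '25 47890')
except RegexPieceError as exc:
    print(exc)
```

Recompiling a growing prefix on every step is slow for long patterns, so this is better suited to debugging than to a hot path.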