Python re.match behavior when there is no match

Date: 2015-09-24 17:14:25

Tags: python regex

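Dear community, it seems that when re.match does not find a match, it shows an error. Is this an exception? I am using Spyder as my IDE and running the code from there.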

import re
import sys
def extract_year(line):
    #mat=re.search(r'Popularity\s+in\s+\d{4}',line)
    #mat=re.search(r'Popularity\s+in\s+[(19)(20)]\d{2}',line)   

    mat=re.search(r'Popularity\s+in\s+(19[0-9][0-9]|200[0-9]|201[0-9])',line)
    """
    if  __debug__:
        print mat.group(1)
        print mat.group()
    """        
    try:
        return mat.group(1)
    #print mat.group(2)
    except:
        e = sys.exc_info()  
        print e
        return ""

extract_year(' <h3 align="center">Popularity in 1898</h3>')
extract_year(' <h3 align="center">Popularity in 2018</h3>')   
extract_year(' <h3 align="center">Popularity in 1988</h3>')   
extract_year('cellpadding="2 cellspacing="0 summary="Popularity for top 1000"><caption><h2>Popularity in 1908</h2></caption>') 

Why do I get this in the console output? Is an exception thrown when there is no match? If so, why isn't it caught in the try block?

    extract_year(' <h3 align="center">Popularity in 1898</h3>')

    (<type 'exceptions.AttributeError'>, AttributeError("'NoneType' object has no attribute 'group'",), <traceback object at 0x0000000013739748>)
    Out[98]: ''

    extract_year(' <h3 align="center">Popularity in 2018</h3>')   

    Out[99]: '2018'

    extract_year(' <h3 align="center">Popularity in 1988</h3>')   

    Out[100]: '1988'

    extract_year('cellpadding="2 cellspacing="0 summary="Popularity for top 1000"><caption><h2>Popularity in 1908</h2></caption>') 

    Out[101]: '1908'

3 Answers:

Answer 0: (score: 0)

If there is no match, re.search() returns None, so you cannot call .group on the result.
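To make this concrete, here is a minimal sketch that reuses the pattern from the question. It is written with `print()` calls so that it also runs on Python 3; the names `PATTERN`, `no_match` and `message` are only illustrative.

```python
import re

# Pattern copied from the question; it only accepts years 1900-2019.
PATTERN = r'Popularity\s+in\s+(19[0-9][0-9]|200[0-9]|201[0-9])'

# 1898 is outside the accepted range, so search() finds nothing.
no_match = re.search(PATTERN, ' <h3 align="center">Popularity in 1898</h3>')
print(no_match is None)

# Calling .group on None is what raises the AttributeError.
try:
    no_match.group(1)
except AttributeError as exc:
    message = str(exc)
    print(message)
```

The exception comes from the attribute lookup on None, not from re.search() itself, which returns normally.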

Answer 1: (score: 0)

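When there is no match, search returns None, so calling group on None fails and raises an exception. Your try block does catch that exception. The reason you still see 'exceptions.AttributeError'... is that your except block prints e, i.e. the tuple returned by sys.exc_info(). What appears in the console is that printed tuple, not an uncaught error, and the function then returns "" as written.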
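If the goal is the same return value without the printed tuple, one option is to test for None directly instead of relying on try/except. A minimal sketch, assuming an empty string is still the desired result for lines without a match; the name `extract_year_quiet` is hypothetical and the pattern is copied from the question.

```python
import re

# Pattern copied from the question; compiled once for reuse.
YEAR_RE = re.compile(r'Popularity\s+in\s+(19[0-9][0-9]|200[0-9]|201[0-9])')


def extract_year_quiet(line):
    """Return the matched year as a string, or "" when nothing matches."""
    mat = YEAR_RE.search(line)
    if mat is None:
        # No match: nothing is printed and no exception is raised.
        return ""
    return mat.group(1)


print(repr(extract_year_quiet(' <h3 align="center">Popularity in 1898</h3>')))
print(repr(extract_year_quiet(' <h3 align="center">Popularity in 2018</h3>')))
```

If try/except is preferred, catching `AttributeError` specifically rather than using a bare `except:` avoids hiding unrelated errors.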

Answer 2: (score: -1)

If no match can be found, re.search returns None.

import re


def extract_year(line):
    """Write a docstring for your function"""
    mat = re.search(r'Popularity in (\d{4})', line)
    if mat is None:
        raise ValueError('No year found.')
    else:
        return mat.group(1)

print extract_year(' <h3 align="center">Popularity in 1898</h3>')
print extract_year(' <h3 align="center">Popularity in 2018</h3>')
print extract_year(' <h3 align="center">Popularity in 1988</h3>')
print extract_year('cellpadding="2 cellspacing="0 summary="Popularity for top 1000"><caption><h2>Popularity in 1908</h2></caption>')
print extract_year('no year here')  # raises ValueError('No year found.')
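With this version, a line without a year raises ValueError, so a caller that processes many lines has to decide what to do with it. The sketch below shows one possibility, skipping such lines; `years_from_lines` is a hypothetical helper that is not part of the answer, and the code uses Python 3 compatible syntax.

```python
import re


def extract_year(line):
    """Same logic as the answer: return the year or raise ValueError."""
    mat = re.search(r'Popularity in (\d{4})', line)
    if mat is None:
        raise ValueError('No year found.')
    return mat.group(1)


def years_from_lines(lines):
    """Hypothetical helper: collect years, skipping lines without one."""
    years = []
    for line in lines:
        try:
            years.append(extract_year(line))
        except ValueError:
            # Only the "no year" case is skipped; other errors propagate.
            continue
    return years


print(years_from_lines([
    ' <h3 align="center">Popularity in 1898</h3>',
    'no year here',
    ' <h3 align="center">Popularity in 2018</h3>',
]))
```

Note that this pattern accepts any four digits, so 1898 matches here even though the pattern in the question rejects it.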