Error when trying to use the ground method in nltk_contrib.timex

Time: 2017-01-25 11:45:41

Tags: python nltk

import datetime
from nltk_contrib import timex

now = datetime.date.today()
basedate = timex.Date(now.year, now.month, now.day)

print timex.ground(timex.tag("Hai i would like to go to mumbai 22nd of next month"), basedate)

print str(datetime.date.day)

When I try to run the code above, I get the following error:

File "/usr/local/lib/python2.7/dist-packages/nltk_contrib/timex.py", line 250, in ground
    elif re.match(r'last ' + month, timex, re.IGNORECASE):
UnboundLocalError: local variable 'month' referenced before assignment

What should I do to fix this error?

2 answers:

Answer 0: (score: 1)

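The timex module has a bug: the ground function reads the module-level variable month, but because the function also assigns to month, Python treats the name as a local variable that has not been assigned yet.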
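The behaviour can be reproduced outside the library. The sketch below is not the timex code: the shortened pattern and the function names broken and raises_unbound_local are made up for illustration. It only shows how an assignment anywhere in a function body makes the name local for the whole function.

```python
import re

# Hypothetical stand-in for the module-level pattern defined in timex.py
month = "(january|february|march)"

def broken(text):
    # Reading `month` here fails: the assignment further down makes the
    # name local to the entire function, so it is still unbound at this point.
    if re.match(r'last ' + month, text, re.IGNORECASE):
        month = "rebound inside the function"
        return True
    return False

def raises_unbound_local(func, text):
    # Helper for the demonstration: report whether the call fails
    # with UnboundLocalError.
    try:
        func(text)
    except UnboundLocalError:
        return True
    return False

print(raises_unbound_local(broken, "last march"))
```

The error is raised even when the assignment is never reached, because the decision that month is local is made when the function is compiled, not when it runs.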

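To fix the error, add the global declaration shown in the following code to the ground function, which should start at line 171: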

def ground(tagged_text, base_date):

    # Find all identified timex and put them into a list
    timex_regex = re.compile(r'<TIMEX2>.*?</TIMEX2>', re.DOTALL)
    timex_found = timex_regex.findall(tagged_text)
    timex_found = map(lambda timex:re.sub(r'</?TIMEX2.*?>', '', timex), \
                timex_found)

    # Calculate the new date accordingly
    for timex in timex_found:
        global month # <--- declare month as global here

Answer 1: (score: 0)

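The solution above, which declares month as a global variable, causes other problems when timex is called several times in a row, because the variable is not reset unless the module is re-imported. For me, this happened in an AWS Lambda deployment environment.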
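A rough illustration of that side effect, assuming ground overwrites month with some other value later in its body (the value "3" and the name ground_with_global are invented here, and the pattern is shortened):

```python
import re

# Hypothetical stand-in for the module-level pattern in timex.py
month = "(january|february|march)"

def ground_with_global(text):
    global month
    matched = re.match(r'last ' + month, text, re.IGNORECASE) is not None
    # Assumed stand-in for the later reassignment inside ground;
    # with `global`, it overwrites the module-level pattern.
    month = "3"
    return matched

first = ground_with_global("last march")   # pattern is still intact
second = ground_with_global("last march")  # pattern was overwritten by the first call
print(first)
print(second)
```

In a long-lived process, such as a warm Lambda container, the module stays imported between invocations, so the overwritten value carries over to later calls.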

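A solution that is not very pretty, but does not cause this problem, is simply to set the value of month again inside the ground function: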

def ground(tagged_text, base_date):

    # Find all identified timex and put them into a list
    timex_regex = re.compile(r'<TIMEX2>.*?</TIMEX2>', re.DOTALL)
    timex_found = timex_regex.findall(tagged_text)
    timex_found = map(lambda timex:re.sub(r'</?TIMEX2.*?>', '', timex), \
            timex_found)

    # Calculate the new date accordingly
    for timex in timex_found:
        month = ("(january|february|march|april|may|june|july|august|september|"
                 "october|november|december)") # <--- reset month to the value it is set to upon import
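One thing to watch when the pattern is written across several lines: a backslash continuation inside a string literal keeps the indentation of the following line as part of the string, which changes what the regular expression matches. A small sketch with a shortened pattern (the names continued and clean are only for illustration):

```python
import re

# Backslash continuation inside the literal: the spaces before "october"
# become part of the pattern.
continued = "(september| \
    october|november|december)"

# Adjacent string literals are joined without any extra whitespace.
clean = ("(september|"
         "october|november|december)")

print(re.match(continued, "october") is not None)
print(re.match(clean, "october") is not None)
```

With the continued form, the alternative for october only matches when it is preceded by the embedded spaces, so splitting the literal into adjacent strings is the safer way to wrap the line.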