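I am using Python's `parsedatetime` library to parse natural-language dates and times. It handles many inputs correctly, for example `next Monday at 5PM` or `next month`.
However, it fails on `day after tomorrow` and `day before yesterday`. For example, `day after tomorrow` resolves to tomorrow's date and time.
Here is the code snippet: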

    from datetime import datetime
    import parsedatetime as pdt

    plain_text = 'day after tomorrow'  # Natural Language input
    str_parsed_date_time = ''
    cal = pdt.Calendar()
    now = datetime.now()
    for time_string in [plain_text]:
        parsed_date_time = (cal.parseDT(time_string, now)[0])
        str_parsed_date_time = datetime.strftime(parsed_date_time, '%Y-%m-%d %H:%M:%S')  # Convert date time to string
    print(str_parsed_date_time)

Today's date is April 18th 2017 (2017-04-18).
The library outputs 2017-04-19 instead of 2017-04-20.
What could be the reason?
Answer 0 (score: 1)
`parsedatetime` expects a quantity in front of its units. It therefore parses `a day after tomorrow` successfully, but not `day after tomorrow`.
Test code:

    import parsedatetime as pdt

    test_text = [
        'day after tomorrow',
        'the day after tomorrow',
        'a day after tomorrow',
        'an day after tomorrow',
        'one day after tomorrow',
        'two day after tomorrow',
    ]

    cal = pdt.Calendar()
    for time_string in test_text:
        result = cal.nlp(time_string)[0]
        print("Got: %s from:'%s' original:'%s'" % (
            result[0].date(), result[-1], time_string))

Results:

    Got: 2017-04-20 from:'after tomorrow' original:'day after tomorrow'
    Got: 2017-04-20 from:'after tomorrow' original:'the day after tomorrow'
    Got: 2017-04-21 from:'a day after tomorrow' original:'a day after tomorrow'
    Got: 2017-04-21 from:'an day after tomorrow' original:'an day after tomorrow'
    Got: 2017-04-21 from:'one day after tomorrow' original:'one day after tomorrow'
    Got: 2017-04-22 from:'two day after tomorrow' original:'two day after tomorrow'

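Note that in the first two results the `from` string does not match the `original` string. The returned string is the part of the input that was actually used in the parse: `parsedatetime` did not recognize `day` as a unit there, because no quantity precedes it, and so ignored it.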
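
If rewording the input is not an option, one possible workaround is to resolve these two fixed phrases yourself before handing anything to the parser. The sketch below uses only the standard library. `resolve_fixed_phrase` and `_FIXED_OFFSETS` are hypothetical names invented for this illustration, not part of `parsedatetime`, and the sketch assumes that "day after tomorrow" and "day before yesterday" should mean exactly two days ahead of or behind the reference time.

```python
import re
from datetime import datetime, timedelta

# Hypothetical lookup table (not part of parsedatetime): phrases the
# parser mishandles, mapped to an offset in whole days.
_FIXED_OFFSETS = {
    "day after tomorrow": 2,
    "day before yesterday": -2,
}


def resolve_fixed_phrase(text, now):
    """Return `now` shifted by the phrase's day offset, or None if the
    text is not one of the known fixed phrases."""
    # Normalize case and whitespace, then drop a leading "the".
    key = re.sub(r"\s+", " ", text.strip().lower())
    key = re.sub(r"^the ", "", key)
    days = _FIXED_OFFSETS.get(key)
    if days is None:
        return None
    return now + timedelta(days=days)


now = datetime(2017, 4, 18, 9, 30)
print(resolve_fixed_phrase("day after tomorrow", now))        # 2017-04-20 09:30:00
print(resolve_fixed_phrase("The day before yesterday", now))  # 2017-04-16 09:30:00
print(resolve_fixed_phrase("next Monday at 5PM", now))        # None
```

When the helper returns `None`, fall back to `cal.parseDT(time_string, now)[0]` as in the question. This only covers inputs that consist of the phrase alone; a phrase embedded in a longer sentence would still reach the parser unchanged.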