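Question
When I use dateparser to search a string for dates, I get back tuples that each contain the matched string and a datetime.datetime object. I only want the string, and my input contains several dates, each of which I want separately.
Any ideas on how to isolate the text from the result, i.e. drop the datetime.datetime object?
Reason:
I want to store the matched text in a variable and then parse the words that come before the date.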
from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para[0]:
    print(x)
    print(type(x))
What I am looking for is '1/03/19 at 6:00 AM and'.
Output:
1/03/19 at 6:00 AM and
<class 'str'>
2019-03-01 06:00:00
<class 'datetime.datetime'>
Attempts
I have tried the following:
First:
from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para[0]:
    date_time = x[0]
    date_string = x[1]
    print(date_time)
Output:
TypeError: 'datetime.datetime' object is not subscriptable
And also this:
from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para[0]:
    print(x(0))
Output:
TypeError: 'str' object is not callable
And finally:
from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para:
    date_string = x[0]
    print(date_string)
    print(type(date_string))
Output:
1/03/19 at 6:00 AM and
<class 'str'>
17/05/19 at 5:00 PM
<class 'str'>
Answer 0 (score: 0)
As you have already noted, each tuple contains two elements: a string and a datetime object. For example:
('1/03/19 at 6:00 AM and', datetime.datetime(2019, 3, 1, 6, 0))
from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para:
    date_string = x[0]
    print(date_string)
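The same loop can also be written with tuple unpacking, which names both elements instead of indexing into the tuple. The sketch below is only an illustration: `results` is a hard-coded stand-in for the list shown in the question, so it runs without dateparser installed.

```python
from datetime import datetime

# Hard-coded stand-in for the list that search_dates returned in the question.
results = [
    ("1/03/19 at 6:00 AM and", datetime(2019, 3, 1, 6, 0)),
    ("17/05/19 at 5:00 PM", datetime(2019, 5, 17, 17, 0)),
]

date_strings = []
# Each item is a (matched_text, datetime) pair, so unpack it directly.
for date_string, date_time in results:
    date_strings.append(date_string)  # keep the text, ignore the datetime

print(date_strings)
```

Iterating over `para[0]`, as in the first attempts, walks through the two elements of a single tuple, which is why `x` was first a str and then a datetime.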
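You may also want to remove the trailing 'and' from the text. Note that str.strip('and') does not remove the word 'and'; it strips any of the characters a, n and d from both ends and leaves the trailing space behind. On Python 3.9+ it is safer to remove the suffix explicitly, i.e.
date_string = x[0].removesuffix('and').rstrip()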
1/03/19 at 6:00 AM
17/05/19 at 5:00 PM
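If you want the trailing word removed only when it really is a separate word, and on any Python 3 version, a small helper is one option. This is a minimal sketch; `strip_trailing_word` is a hypothetical name used for illustration, not part of dateparser.

```python
def strip_trailing_word(text, word="and"):
    """Remove `word` from the end of `text` only if it is a separate trailing word."""
    stripped = text.rstrip()
    # Require a space before the word so that e.g. "Portland" is left alone.
    if stripped.endswith(" " + word):
        return stripped[: -len(word)].rstrip()
    return stripped


print(strip_trailing_word("1/03/19 at 6:00 AM and"))
print(strip_trailing_word("17/05/19 at 5:00 PM"))
```

Unlike a character-based strip, this leaves strings that merely end in the letters a, n or d untouched.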
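If you only want the strings and want to discard the datetimes entirely, use a list comprehension to build the para variable. In the example below, para is a plain list of strings rather than a list of tuples; the datetime objects are dropped entirely.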
para = [d[0] for d in search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})]
print(para)
# Output is just a 1D list of strings
# ['1/03/19 at 6:00 AM and', '17/05/19 at 5:00 PM']
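# removesuffix needs Python 3.9+; strip('and') would strip the characters a, n, d instead
print(para[0].removesuffix('and').rstrip())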
# Output is first string in the list with 'and' stripped off
# 1/03/19 at 6:00 AM
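Since the stated goal is to parse the words that come before each date, the matched strings can be located in the original text again. The following is a rough sketch under two assumptions: the matched strings appear verbatim in the input (as the output in the question suggests), and `results` may be None when nothing is found. `text_before_matches` is a hypothetical helper name, and `results` is hard-coded here so the example runs without dateparser.

```python
from datetime import datetime


def text_before_matches(text, results):
    """Pair each matched date string with the text that precedes it."""
    pairs = []
    cursor = 0
    # `results or []` tolerates a None result (no dates found).
    for date_string, _ in results or []:
        start = text.find(date_string, cursor)
        if start == -1:
            continue  # matched text not found verbatim; skip it
        pairs.append((text[cursor:start].strip(), date_string))
        cursor = start + len(date_string)
    return pairs


text = "Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM"
# Hard-coded stand-in for the search_dates result shown in the question.
results = [
    ("1/03/19 at 6:00 AM and", datetime(2019, 3, 1, 6, 0)),
    ("17/05/19 at 5:00 PM", datetime(2019, 5, 17, 17, 0)),
]

pairs = text_before_matches(text, results)
for before, date_string in pairs:
    print(repr(before), "->", repr(date_string))
```

Searching from `cursor` rather than from the start keeps the pairs in order even if the same date text occurs more than once.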