Question

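Problem

When I use dateparser to search for dates in a string, I get back tuples that contain both the matched date text as a string and a datetime.datetime object. I only want the string. The text can contain several dates, and each one comes back as a separate result.

Any ideas on how to isolate the text from the result, i.e. drop the datetime.datetime object?

Reason:

I want to store the matched text in a variable and then use it to parse the words that come before the date.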

from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para[0]:
    print (x)
    print(type(x))

What I am looking for is '1/03/19 at 6:00 AM and'

Output:

1/03/19 at 6:00 AM and
<class 'str'>
2019-03-01 06:00:00
<class 'datetime.datetime'>

Attempts

I have tried the following:

First:

from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para[0]:
    date_time = x[0]
    date_string =  x[1]
    print(date_time)

Output:

TypeError: 'datetime.datetime' object is not subscriptable

And also this:

from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para[0]:
    print (x(0))

Output:

TypeError: 'str' object is not callable

Finally:

from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para:
    date_string =  x[0]
    print(date_string)
    print(type(date_string))

Output:

1/03/19 at 6:00 AM and
<class 'str'>
17/05/19 at 5:00 PM
<class 'str'>

Answer 1

As you have noted, each tuple contains two elements: the matched string and a datetime object. For example:

('1/03/19 at 6:00 AM and', datetime.datetime(2019, 3, 1, 6, 0))

You can isolate just the string by indexing each tuple.

For example:

from dateparser.search import search_dates
para = search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})
for x in para:
    date_string = x[0]  # first element of each tuple is the matched text
    print(date_string)
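
The attempts in the question failed because `para[0]` is a single tuple, so looping over it yields the string and then the datetime, not a sequence of tuples. The sketch below illustrates this with tuple unpacking. To stay runnable without dateparser installed, it uses a hard-coded `results` list as a stand-in for the return value of `search_dates`; the second datetime is an assumption inferred from the `DMY` setting, not output shown in the question.

```python
from datetime import datetime

# Stand-in for the list returned by search_dates in the question
# (assumed values; the second datetime is inferred, not copied from real output).
results = [
    ("1/03/19 at 6:00 AM and", datetime(2019, 3, 1, 6, 0)),
    ("17/05/19 at 5:00 PM", datetime(2019, 5, 17, 17, 0)),
]

# results[0] is one tuple, so iterating over it gives a str, then a datetime.
first = results[0]
element_types = [type(item).__name__ for item in first]
print(element_types)  # ['str', 'datetime']

# Unpack each tuple while looping over the whole list;
# the underscore marks the datetime as intentionally unused.
date_strings = [text for text, _ in results]
print(date_strings)  # ['1/03/19 at 6:00 AM and', '17/05/19 at 5:00 PM']
```

Unpacking with `text, _` does the same job as `x[0]`, but makes it explicit that the second element is being ignored.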

You may also want to remove the trailing "and" from the text. You can do that by stripping it, i.e.

date_string = x[0].strip('and')

Output:

1/03/19 at 6:00 AM 
17/05/19 at 5:00 PM
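
A caveat about `strip('and')`: `str.strip` treats its argument as a set of characters to remove from both ends, not as a word. It works for the strings here, but it leaves the trailing space behind and can eat into other text that happens to end in `a`, `n` or `d`. A minimal sketch of a stricter alternative follows; the helper name `drop_trailing_and` is hypothetical and not part of dateparser.

```python
import re


def drop_trailing_and(text):
    """Remove a trailing standalone word 'and' plus the whitespace around it."""
    return re.sub(r"\s*\band\s*$", "", text)


# str.strip removes any of the characters 'a', 'n', 'd' from both ends.
print(repr("1/03/19 at 6:00 AM and".strip("and")))  # '1/03/19 at 6:00 AM '
print(repr("2 Jan".strip("and")))                   # '2 J'

# The regex only removes the whole word at the end of the string.
print(repr(drop_trailing_and("1/03/19 at 6:00 AM and")))  # '1/03/19 at 6:00 AM'
print(repr(drop_trailing_and("2 Jan")))                   # '2 Jan'
```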

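If you only want the strings and want to discard the datetimes entirely, use a list comprehension to build the para variable. In the example below, para is populated with a plain list of strings rather than tuples. The datetimes are discarded completely.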

para = [d[0] for d in search_dates("Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM", settings={'STRICT_PARSING': True, 'DATE_ORDER': 'DMY'})]
print(para)
# Output is just a 1D list of strings
# ['1/03/19 at 6:00 AM and', '17/05/19 at 5:00 PM']
print(para[0].strip('and'))
# Output is first string in the list with 'and' stripped off
# 1/03/19 at 6:00 AM
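
Since the stated goal is to parse the words that come before each date, the extracted strings can be used to slice the original text. This is only a sketch: the function name `words_before` is hypothetical, the `date_strings` list is hard-coded from the output above, and it assumes each matched string appears verbatim in the original text.

```python
text = "Competition opens 1/03/19 at 6:00 AM and closes 17/05/19 at 5:00 PM"

# Hard-coded from the output above; in practice this would come from search_dates.
date_strings = ["1/03/19 at 6:00 AM and", "17/05/19 at 5:00 PM"]


def words_before(text, matches):
    """Return the text preceding each match, scanning left to right."""
    pieces = []
    cursor = 0
    for match in matches:
        start = text.find(match, cursor)
        if start == -1:
            # Match not found verbatim; skip it rather than guess.
            continue
        pieces.append(text[cursor:start].strip())
        cursor = start + len(match)
    return pieces


preceding = words_before(text, date_strings)
print(preceding)  # ['Competition opens', 'closes']
```

Tracking `cursor` keeps the search moving forward, so a date string that occurs twice in the text is matched in order rather than always at its first position.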
