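Regex: I have a list of strings in which some entries are a date and time, and I want to remove those date entries and the empty strings from the list.
This is my input list: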
['Hello how are you',
'',
'fine',
'',
'had you break fast',
'',
'I had 1',
'',
'2016-06-11 5:06 PM',
'',
'Are you going to school today ',
'No!',
'',
'What? You gave ',
'I given money.',
'',
'2',
'',
'money 2',
'',
'2016-06-11 5:08 PM',
'']
Desired output: the list after processing
['Hello how are you',
'fine',
'had you break fast',
'I had 1',
'Are you going to school today ',
'No!',
'What? You gave ',
'I given money.',
'2',
'money 2']
Answer 0 (score: 1)
import re
dirty_list = ['Hello how are you', '', 'fine', '', 'had you break fast', '', 'I had 1', '', '2016-06-11 5:06 PM', '', 'Are you going to school today ', 'No!', '', 'What? You gave ', 'I given money.', '', '2', '', 'money 2', '', '2016-06-11 5:08 PM', '']
clean_list = []
for i in dirty_list:
    # Keep the item only if it is non-empty and contains no "YYYY-MM-DD H:MM" timestamp.
    if i != '' and not re.search(r'\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}', i):
        clean_list.append(i)
This should do it. It skips the empty strings in the list, as well as any item that contains a date in the format shown.
Output:
print(clean_list)
['Hello how are you', 'fine', 'had you break fast', 'I had 1', 'Are you going to school today ', 'No!', 'What? You gave ', 'I given money.', '2', 'money 2']
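The same filter can also be written as a list comprehension with a precompiled pattern. The following is only a sketch under some assumptions: the helper name `remove_dates_and_blanks` and the constant `DATE_TIME` are made up for illustration, the timestamp is assumed to always look like `YYYY-MM-DD H:MM` with an optional AM/PM suffix, and whitespace-only strings are treated as blank. Unlike the loop above, it uses `fullmatch`, so an item is dropped only when the whole string is a timestamp, not when a timestamp merely appears somewhere inside a longer sentence.

```python
import re

# Assumed timestamp shape: YYYY-MM-DD, whitespace, H:MM or HH:MM, optional AM/PM.
DATE_TIME = re.compile(r"\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}(\s*[AP]M)?", re.IGNORECASE)

def remove_dates_and_blanks(items):
    """Return the items that are neither blank nor a bare date-time stamp."""
    return [
        s for s in items
        # s.strip() is falsy for '' and for whitespace-only strings.
        if s.strip() and not DATE_TIME.fullmatch(s.strip())
    ]

dirty_list = ['Hello how are you', '', 'fine', '', '2016-06-11 5:06 PM', '',
              'No!', '', '2', '', 'money 2', '', '2016-06-11 5:08 PM', '']
clean_list = remove_dates_and_blanks(dirty_list)
print(clean_list)
```

If lines that merely contain a timestamp should be removed as well, `DATE_TIME.search(s)` can be swapped in for `fullmatch`, which matches the behaviour of the original loop.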