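How do I write a regular expression that matches the time-related part of the following strings? The output should be only the timestamp.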
l1=May 30, 2012 at 8:13 AM Comment · 1Like Unlike · Bookmark Unbookmark
l2=yesterday at 12:13 AM 2Comment Like Unlike · Bookmark Unbookmark
l3=Two days ago at 01:18 AM Comment · 5Like Unlike · Bookmark Unbookmark
l4=Two days ago at 15:54 PM Comment · Like Unlike · Bookmark Unbookmark
EDIT
l5=Two days ago at 15:54:51 PM · Comment · Like Unlike · Bookmark Unbookmark
Output:
array1 = [May 30, 2012 at 8:13 AM ,yesterday at 12:13 AM ,Two days ago at 01:18 AM,Two days ago at 15:54 PM]
array2=[Comment · 1Like Unlike · Bookmark Unbookmark,2Comment · Like Unlike · Bookmark Unbookmark,Comment · 5Like Unlike · Bookmark Unbookmark,Comment · Like Unlike · Bookmark Unbookmark]
Edit
I tried p_date = re.compile(r'(\d{1,2}[:]\d{1,2})') but I wasn't sure how to handle a timestamp that also includes seconds, like 23:12:29.
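One way to cover the optional seconds is to make the third field an optional non-capturing group. This is only a minimal sketch: the name p_time, the sample strings, and the optional AM/PM suffix are assumptions for illustration, not part of the original attempt.

```python
import re

# Hours and minutes, an optional ":SS" part, and an optional AM/PM marker.
# The AM/PM handling is an assumption based on the sample lines.
p_time = re.compile(r'\d{1,2}:\d{2}(?::\d{2})?(?:\s*[AP]M)?')

samples = [
    "May 30, 2012 at 8:13 AM",
    "Two days ago at 15:54:51 PM",
    "logged at 23:12:29",  # hypothetical line without AM/PM
]

# search() returns the first time-like match in each string.
times = [p_time.search(text).group(0) for text in samples]
print(times)
```

Note that this extracts only the clock time, not the leading date words such as "yesterday at".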
Answer 0 (score: 2)
>>> import re
>>> pattern = r'l\d+=(.*?)·(.*)'
>>> l1 = []
>>> l2 = []
>>> for line in s.split('\n'):  # s holds the input lines from the question
...     m = re.match(pattern, line)
...     if m:
...         l1.append(m.groups()[0])
...         l2.append(m.groups()[1])
...
>>> l1
['May 30, 2012 at 8:13 AM ', 'yesterday at 12:13 AM ', 'Two days ago at 01:18 AM ', 'Two days ago at 15:54 PM ']
>>> l2
[' Comment \xb7 1Like Unlike \xb7 Bookmark Unbookmark', ' 2Comment \xb7 Like Unlike \xb7 Bookmark Unbookmark', ' Comment \xb7 5Like Unlike \xb7 Bookmark Unbookmark', ' Comment \xb7 Like Unlike \xb7 Bookmark Unbookmark']
>>>
Edit: added a match for the l1= prefix so that it is excluded from the captured groups.
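If some lines lack the "·" right after the timestamp, a variant that ends the first group at the AM/PM marker may be more robust. The following is a sketch under that assumption; the names timestamps and actions and the two sample lines are chosen for illustration.

```python
import re

# "lN=" prefix, then everything up to the clock time plus AM/PM,
# then an optional "·" separator, then the rest of the line.
pattern = re.compile(
    r'^l\d+=(.*?\d{1,2}:\d{2}(?::\d{2})?\s*[AP]M)\s*(?:·\s*)?(.*)$'
)

s = """l1=May 30, 2012 at 8:13 AM Comment · 1Like Unlike · Bookmark Unbookmark
l5=Two days ago at 15:54:51 PM · Comment · Like Unlike · Bookmark Unbookmark"""

timestamps, actions = [], []
for line in s.split('\n'):
    m = pattern.match(line)
    if m:  # lines that do not fit the assumed shape are skipped
        timestamps.append(m.group(1))
        actions.append(m.group(2))

print(timestamps)
print(actions)
```

This relies on every timestamp ending in AM or PM, which holds for the sample lines but is not guaranteed for other input.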
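Answer 1 (score: 0)
You can split each line on the "·" separator, provided your input format is consistent. Writing regular expressions that recognize every form of timestamp string can be a tedious task.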
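A minimal sketch of the splitting approach, assuming the first "·" always follows the timestamp; the helper name split_post and the handling of the "lN=" label are hypothetical.

```python
def split_post(line, sep="·"):
    """Split a line at the first separator into (timestamp, rest)."""
    # Drop the "lN=" label if one is present (label format is assumed).
    _, eq, tail = line.partition("=")
    body = tail if eq else line
    # partition() cuts at the first separator only.
    timestamp, _, rest = body.partition(sep)
    return timestamp.strip(), rest.strip()

result = split_post(
    "l5=Two days ago at 15:54:51 PM · Comment · Like Unlike · Bookmark Unbookmark"
)
print(result)
```

If a line has no separator at all, the whole line comes back as the timestamp and the second element is empty, so inconsistent input still needs a check.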