Question

How can I use re to match the following pattern?

2016-02-13 02:00:00.0,3525,http://www.heatherllindsey.com/2016/02/my-husband-left-his-9-5-job-for-good-it.html,158,0,2584490

I used Python's split() function to separate the fields, but because the data set is very large, the process was killed with a memory error.

Answer 1

It would be better if you had posted a longer sample of the string. That said, here is one way to do it:

import re

line = "2016-02-13 02:00:00.0,3525,http://www.heatherllindsey.com/2016/02/my-husband-left-his-9-5-job-for-good-it.html,158,0,2584490"
# Lazily capture every field that is followed by a comma.
# re.DOTALL lets "." match newlines too, so matching continues past line endings.
pattern = re.compile(r"(.*?),", re.DOTALL)
result = pattern.findall(line)
# The last field (2584490) has no trailing comma, so the pattern misses it;
# take everything after the final comma and append it separately.
result.append(line.rsplit(",", 1)[-1])
print(result)

It works.
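The memory error in the question is probably caused by holding the whole data set in memory at once rather than by split() itself, although the question does not confirm this. Under that assumption, a minimal sketch is to read the input one line at a time and match each line against a single compiled pattern. The names `FIELD_PATTERN` and `parse_lines` are made up for illustration, and the pattern assumes exactly six fields, none of which contains a comma.

```python
import io
import re

# Six comma-separated fields; assumes no field contains a comma itself.
FIELD_PATTERN = re.compile(r"^([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)$")


def parse_lines(lines):
    """Yield one tuple of six fields per matching line, one line at a time."""
    for raw in lines:
        match = FIELD_PATTERN.match(raw.rstrip("\r\n"))
        if match:  # lines that do not have exactly six fields are skipped
            yield match.groups()


# io.StringIO stands in for a real file so the sketch is self-contained.
sample = io.StringIO(
    "2016-02-13 02:00:00.0,3525,http://www.heatherllindsey.com/2016/02/"
    "my-husband-left-his-9-5-job-for-good-it.html,158,0,2584490\n"
)
rows = list(parse_lines(sample))
print(rows[0])
```

With a real file, pass the open file object straight to parse_lines() and handle each row inside the loop instead of collecting everything in a list, so only one line is held in memory at a time.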
