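Splitting a list into a list of lists by regular expression

Date: 2015-07-07 07:30:36

Tags: python regex

I want to split a list of strings into a list of lists, where the split points are defined by a successful regular expression match.

For example, say I have the input list: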

["file1","A","B","C","file2","D","E","F","G","H","I"]

I want to produce:

[["file1","A","B","C"],["file2","D","E","F","G","H","I"]]

where the split points file1 and file2 are identified by a successful match of:

re.search("file[0-9]+",<TEST STRING>)

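Neither the number of items between split points nor the number of 'fileXXX' terms in the original list is known in advance.

In reality my regex is far more complex than this, but that is not the point. What I need help with, if someone would be so kind, is a Pythonic way to perform the splitting logic.

3 Answers:

Answer 0: (score: 3)

This assumes the first element is a proper header. If it is not, you will need to add some defensive clauses.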
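A minimal sketch of this idea, not the answerer's original code: start a new sublist at every header and append everything else to the most recent sublist. The function name `split_at_headers` and the choice to raise `ValueError` as the defensive clause are assumptions made for illustration.

```python
import re


def split_at_headers(items, pattern=r"file[0-9]+"):
    """Group items into sublists, each starting at an item that matches pattern."""
    header = re.compile(pattern)
    groups = []
    for item in items:
        if header.search(item):
            # A header opens a new group.
            groups.append([item])
        elif groups:
            # A non-header item belongs to the most recent group.
            groups[-1].append(item)
        else:
            # Defensive clause: nothing to attach this item to.
            raise ValueError("first item is not a header: %r" % (item,))
    return groups


data = ["file1", "A", "B", "C", "file2", "D", "E", "F", "G", "H", "I"]
result = split_at_headers(data)
print(result)
```

Raising an error is only one possible policy; silently dropping or collecting the leading items would work just as well, depending on what the data should look like.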


Answer 1: (score: 1)

The following should work well:

import re

input_list = ["file1","A","B","C","file2","D","E","F","G","H","I"]
output_list = []

for item in input_list:
    if re.match("file[0-9]+", item):
        # A matching item starts a new sublist
        output_list.append([item])
    else:
        # Anything else goes into the most recent sublist
        output_list[-1].append(item)

print(output_list)

This gives the following result:

[['file1', 'A', 'B', 'C'], ['file2', 'D', 'E', 'F', 'G', 'H', 'I']]

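Note that this assumes the first item is a match; otherwise output_list[-1] raises an IndexError on the empty list.

Update

A second approach could be: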

import re

input_list = ["1", "2", "file1","A","B","C","file2","D","E","F","G","H","I"]
output_list = []

for item in input_list:
    # Also start a sublist when none exists yet, so leading non-matches are kept
    if re.match("file[0-9]+", item) or len(output_list) == 0:
        output_list.append([item])
    else:
        output_list[-1].append(item)

print(output_list)

This also copes with the case where the first item is not a match:

[['1', '2'], ['file1', 'A', 'B', 'C'], ['file2', 'D', 'E', 'F', 'G', 'H', 'I']]
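The same logic can be sketched as a generator, which avoids indexing into the output list and works on any iterable. This is an illustrative variant under the same assumptions as the loop above; the name `iter_groups` is hypothetical.

```python
import re


def iter_groups(items, pattern):
    """Yield lists of items, starting a new list at every item matching pattern."""
    matcher = re.compile(pattern)
    current = []
    for item in items:
        # Close the current group when a new header arrives,
        # unless nothing has been collected yet.
        if matcher.match(item) and current:
            yield current
            current = []
        current.append(item)
    # Emit whatever is left after the last header.
    if current:
        yield current


groups = list(iter_groups(["1", "2", "file1", "A", "file2", "D"], r"file[0-9]+"))
print(groups)
```

Leading items that precede the first match end up in their own group, as in the second approach above.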

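Answer 2: (score: 0)

You can find the indices of the items that match file\d:

indeces = list(i for i,val in enumerate(my_list) if match(r'file\d', val))

Then simply slice the list at those indices (here, for exactly two matches):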

output = [my_list[indeces[0]:indeces[1]], my_list[indeces[1]:]]

For example:

>>> from re import match
>>> my_list = ["file1","A","B","C","file2","D","E","F","G","H","I"]
>>> indeces = list(i for i,val in enumerate(my_list) if match(r'file\d', val))
>>> [my_list[indeces[0]:indeces[1]], my_list[indeces[1]:]]
[['file1', 'A', 'B', 'C'], ['file2', 'D', 'E', 'F', 'G', 'H', 'I']]
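Since the number of matches is not known in advance, the slicing step can be generalized by pairing each index with the next one. The following is a sketch of one way to do that, not part of the original answer; the three-file sample list is made up for illustration.

```python
from re import match

# Hypothetical sample with three headers instead of two
my_list = ["file1", "A", "file2", "B", "C", "file3"]

# Positions of every header
indeces = [i for i, val in enumerate(my_list) if match(r'file\d', val)]

# Pair each start index with the next start (or the end of the list)
ends = indeces[1:] + [len(my_list)]
output = [my_list[start:end] for start, end in zip(indeces, ends)]
print(output)
```

Note that any items before the first match are dropped by this approach, and a list with no matches produces an empty result.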