I have a list that looks like this:
mylist = ['name','mem','g1','g2','g3','foo','bar','qux','zoo','name','mem','foo','bar','qux','zoo']
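As you can see, the strings above fall into two parts, each introduced by 'name','mem'.
What I want is two lists, each holding the indices in mylist of the foo...zoo items from one part.
The result should be: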
firstpart_vals_id = [5,6,7,8]
secondpart_vals_id = [11,12,13,14]
How can I do this in Python?
Everything else in mylist is fixed. The number of foo....zoo items may vary, but that run has the same length and content in both parts (it is symmetric).
Update: I tried a regex-based solution.
>>> from itertools import groupby
>>> import re
>>> mj = re.compile(r'^val(\d+)$')
>>> mylist = ['name','mem','g1','g2','g3','val1','val2','val3','val4','name','mem','val1','val2','val3','val4']
>>> [[x[0] for x in g] for k, g in groupby(enumerate(mylist), key= lambda x: mj.search(x[1].mj)) if k]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <lambda>
AttributeError: 'str' object has no attribute 'mj'
Answer 0 (score: 4)
You can use itertools.groupby:
>>> from itertools import groupby
>>> mylist = ['name','mem','g1','g2','g3','val1','val2','val3','valN','name','mem','val1','val2','val3','valN']
>>> [[x[0] for x in g] for k, g in groupby(
...     enumerate(mylist), key=lambda x: x[1].startswith('val')) if k]
[[5, 6, 7, 8], [11, 12, 13, 14]]
Note that I used a simple str.startswith condition here; you can replace it with a regular expression if needed.
Using a regex:
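import re
from itertools import groupby

mylist = ['name','mem','g1','g2','g3','val1','val2','val3','val1','name','mem','val1','val2','val3','val4']
mj = re.compile(r'^val\d+$')
# Group consecutive (index, value) pairs by whether the value matches,
# then keep the indices of the matching runs only.
print([[x[0] for x in g] for k, g in groupby(
    enumerate(mylist), key=lambda x: bool(mj.search(x[1]))) if k])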
[[5, 6, 7, 8], [11, 12, 13, 14]]
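For the list in the original question, where the items are literal names rather than val-prefixed strings, the same groupby idea can be sketched with a set-membership test. The helper name index_runs and the wanted set are illustrative assumptions, not part of the answer above.

```python
from itertools import groupby

def index_runs(items, predicate):
    """Return one list of indices per consecutive run of items satisfying predicate."""
    return [
        [index for index, _ in group]
        for matched, group in groupby(enumerate(items), key=lambda pair: predicate(pair[1]))
        if matched
    ]

mylist = ['name', 'mem', 'g1', 'g2', 'g3', 'foo', 'bar', 'qux', 'zoo',
          'name', 'mem', 'foo', 'bar', 'qux', 'zoo']
wanted = {'foo', 'bar', 'qux', 'zoo'}  # assumed set of names to locate

firstpart_vals_id, secondpart_vals_id = index_runs(mylist, lambda s: s in wanted)
print(firstpart_vals_id)   # [5, 6, 7, 8]
print(secondpart_vals_id)  # [11, 12, 13, 14]
```

Unpacking into two names relies on the question's statement that there are exactly two parts; with a different number of parts, keep the result as a single list of lists.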
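Answer 1 (score: 1)
You can use list comprehensions to do the basic steps needed (mapping and filtering of sequences). There are probably several ways to get the job done; the code below is one way (N.B. I haven't tested it).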
# First find every occurrence of "name"; we just ignore "mem".
name_indices = [i for (i, s) in enumerate(mylist) if s == 'name']
name_indices.sort()  # probably redundant, but we are going to rely on sorting later.
# Do something similar, but now we don't care about ordering, so use a set.
# You can use some other sequence type if you prefer. Of course we can use
# any condition we choose, not just s.startswith().
val_indices = set(i for (i, s) in enumerate(mylist) if s.startswith('val'))
# We want to build a dictionary of name index to all value indices following it.
nv_map = {}
# Pair each name index with the next one; len(mylist) closes the last part.
for ni, ni_next in zip(name_indices, name_indices[1:] + [len(mylist)]):
    # ni should be a name index, and ni_next should be the next higher one,
    # so insert all val_indices in that range into an element of nv_map.
    nv_map[ni] = set(i for i in val_indices if ni <= i < ni_next)
So we expect nv_map to end up as:
{
0 : {5,6,7,8},
9 : {11,12,13,14}
}
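If the two index lists from the question are needed rather than a dictionary of sets, the mapping can be flattened afterwards. This is a minimal sketch, assuming nv_map has the value shown above; sets are unordered, so each one is sorted explicitly.

```python
# nv_map as expected above: name index -> set of value indices in that part.
nv_map = {0: {5, 6, 7, 8}, 9: {11, 12, 13, 14}}

# Visit the parts in order of their name index and sort each set of indices.
parts = [sorted(nv_map[name_index]) for name_index in sorted(nv_map)]
firstpart_vals_id, secondpart_vals_id = parts
print(firstpart_vals_id)   # [5, 6, 7, 8]
print(secondpart_vals_id)  # [11, 12, 13, 14]
```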