I have a list of strings (my actual list is much longer):
Animals = ["Dog", "1", "F", "2",
"Cat", "M",
"Horse", "40", "D", "9",
"Human", "50", "E",
"Zebra", "50", "4", "P", "A", "19",
"Dog", "60", "1"]
In case it helps, I also have a list of the animal names on their own:
Animal_names = ["Dog", "Cat", "Horse", "Human", "Zebra"]
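What I want to do is convert this into the following (a list of lists, where each sublist starts with an animal name, contains the values that follow it, and ends just before the next animal name appears):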
[["Dog", "1", "F", "2"],
["Cat", "M"],
["Horse", "40", "D", "9"],
["Human", "50", "E"],
["Zebra", "50", "4", "P", "A", "19"],
["Dog", "60", "1"]]
Is there an easy way to do this?
Answer 0 (score: 7)
Short approach (assuming the input list starts with an existing animal name):
res = []
for a in Animals:
    if a in Animal_names:
        # An animal name starts a new sublist
        res.append([a])
    else:
        # Any other value belongs to the most recent animal
        res[-1].append(a)
print(res)
Output:
[['Dog', '1', 'F', '2'], ['Cat', 'M'], ['Horse', '40', 'D', '9'], ['Human', '50', 'E'], ['Zebra', '50', '4', 'P', 'A', '19'], ['Dog', '60', '1']]
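Or another approach without the else branch. It makes more append calls (two for every animal name instead of one), so it should be considered the slower of the two: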
res = []
for a in Animals:
    if a in Animal_names:
        # Open an empty sublist; the name itself is appended below
        res.append([])
    res[-1].append(a)
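Both loops above raise an IndexError on `res[-1]` if the list does not start with an animal name. The following is a minimal sketch of a variant that drops that assumption and uses a set for the membership test. The function name `group_by_names` and the decision to put leading non-name values in a group of their own are assumptions made for illustration, not part of the original answer.

```python
def group_by_names(items, names):
    """Split items into sublists, starting a new sublist at each name."""
    name_set = set(names)  # set lookups are O(1) on average
    groups = []
    for item in items:
        # Start a new group at a name, or when no group exists yet
        # (the latter covers input that begins with a non-name value).
        if item in name_set or not groups:
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups


animals = ["Dog", "1", "F", "2", "Cat", "M", "Dog", "60", "1"]
animal_names = ["Dog", "Cat"]
print(group_by_names(animals, animal_names))
# [['Dog', '1', 'F', '2'], ['Cat', 'M'], ['Dog', '60', '1']]
```

Converting the names to a set matters mostly when the list of names is long; with five names the difference is likely negligible.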
Answer 1 (score: 1)
Create a new array and fill it with values until you reach a value that is present in Animal_names, then append it, clear it, and repeat.
array = []
sub_array = []
for item in Animals:
    # Check if the item is an animal name. If it is, add the previous animal to array
    if item in Animal_names:
        if len(sub_array) != 0:
            array.append(sub_array)
        sub_array = []
    sub_array.append(item)
array.append(sub_array)  # get the last animal too
Answer 2 (score: 1)
This should be a workable solution :D. It loops over the elements of the big list and starts a new list each time it finds a new animal.
final_list = []
animal_list = []
for animal in Animals:
    if animal in Animal_names:
        # Store the previous animal's list before starting a new one
        if len(animal_list):
            final_list.append(animal_list)
        animal_list = [animal]
    else:
        animal_list.append(animal)
final_list.append(animal_list)  # store the last animal too
print(final_list)
Answer 3 (score: 0)
Here is a solution using zip, but it needs an initial loop to find the indices.
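# Positions at which an animal name occurs
animal_indices = [
    i
    for i, name in enumerate(Animals)
    if name in Animal_names
]
# Slice from each name up to the next name (or the end of the list)
animal_structured = [
    Animals[i:j]
    for i, j in zip(animal_indices, animal_indices[1:] + [len(Animals)])
]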
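To make the slicing step easier to follow, here is a small worked sketch on the question's data that prints the intermediate start indices and the (start, end) pairs produced by zip. The variable names `starts` and `bounds` are chosen here for illustration only.

```python
animals = ["Dog", "1", "F", "2",
           "Cat", "M",
           "Horse", "40", "D", "9",
           "Human", "50", "E",
           "Zebra", "50", "4", "P", "A", "19",
           "Dog", "60", "1"]
animal_names = {"Dog", "Cat", "Horse", "Human", "Zebra"}

# Index of every element that is an animal name
starts = [i for i, item in enumerate(animals) if item in animal_names]
print(starts)  # [0, 4, 6, 10, 13, 19]

# Pair each start with the next start; the last group ends at len(animals)
bounds = list(zip(starts, starts[1:] + [len(animals)]))
print(bounds)  # [(0, 4), (4, 6), (6, 10), (10, 13), (13, 19), (19, 22)]

groups = [animals[i:j] for i, j in bounds]
print(groups[1])  # ['Cat', 'M']
```

Note that with this approach any values that appear before the first animal name fall outside every slice and are silently dropped.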
Answer 4 (score: 0)
Lots of data? If speed is a concern, consider using numpy. I would do it as follows:
import numpy as np
animals = np.array(["Dog", "1", "F", "2", "Cat", "M", "Horse", "40", "D", "9", "Human", "50", "E", "Zebra", "50", "4", "P", "A", "19", "Dog", "60", "1"])
animal_names = np.array(["Dog", "Cat", "Horse", "Human", "Zebra"])
split_at = [inx for inx,i in enumerate(animals) if i in animal_names and inx>0]
output = np.split(animals,split_at) # now output is list of np.arrays
output_list = [a.tolist() for a in output] # conversion to list of lists of plain Python str
print(output_list)
Output:
[['Dog', '1', 'F', '2'], ['Cat', 'M'], ['Horse', '40', 'D', '9'], ['Human', '50', 'E'], ['Zebra', '50', '4', 'P', 'A', '19'], ['Dog', '60', '1']]
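The cost of this approach is that you have to install numpy for it to work. The benefit is that it may be faster, which should be most noticeable on large amounts of data. If you care about processing time, I suggest benchmarking the various approaches on data of a size similar to your intended use case.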