For example, I have the following 2D array:
ls = [
[1,2,3,4,'A',5],
[1,2,3,4,'A',5],
[1,2,3,4,'A',5],
[-1,-2,-3,-4,'B',-5],
[-1,-2,-3,-4,'B',-5],
[-1,-2,-3,-4,'B',-5]
]
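I want to select the 1st, 3rd, and 4th columns of ls and save each column into a new list. In addition, I want to filter on the 5th column, i.e. check whether it contains 'A' or 'B', as follows: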
la1 = [int(x[0]) for x in ls if 'A' in x[4]]
la2 = [int(x[2]) for x in ls if 'A' in x[4]]
la3 = [float(x[3]) for x in ls if 'A' in x[4]]
lb1 = [int(x[0]) for x in ls if 'B' in x[4]]
lb2 = [int(x[2]) for x in ls if 'B' in x[4]]
lb3 = [float(x[3]) for x in ls if 'B' in x[4]]
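I know my implementation is not efficient for large arrays. Is there a better implementation? Thank you all for your help!
Answer 0 (score: 1)
You can merge the six list comprehensions into two: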
la1, la2, la3= zip(*((x[0], x[2], float(x[3])) for x in ls if 'A' in x[4]))
lb1, lb2, lb3= zip(*((x[0], x[2], float(x[3])) for x in ls if 'B' in x[4]))
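This first builds a sequence of 3-tuples (x[0], x[2], float(x[3])), then uses the old zip(*values) trick to transpose it and unpack the result into the la1, la2, la3 variables. Note that each result is a tuple rather than a list.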
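To see the zip(*values) trick in isolation, here is a minimal sketch on a small made-up list. The name `rows` and its contents are hypothetical and not part of the question's data.

```python
# Hypothetical sample data, used only to illustrate the transposition.
rows = [(1, 3, 4.0), (10, 30, 40.0), (100, 300, 400.0)]

# zip(*rows) is equivalent to zip(rows[0], rows[1], rows[2]):
# it groups the i-th element of every row, i.e. it yields the columns.
col_a, col_b, col_c = zip(*rows)

print(col_a)  # (1, 10, 100)
print(col_b)  # (3, 30, 300)
print(col_c)  # (4.0, 40.0, 400.0)

# zip produces tuples; wrap one in list() if a mutable list is needed.
col_a_list = list(col_a)
```

If the rest of the program appends to the results, converting with list() as in the last line avoids surprises.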
A simple loop is more efficient than this, since it makes a single pass over ls:
la1, la2, la3 = [], [], []
lb1, lb2, lb3 = [], [], []
for x in ls:
    if 'A' in x[4]:
        la1.append(x[0])
        la2.append(x[2])
        la3.append(float(x[3]))
    if 'B' in x[4]:
        lb1.append(x[0])
        lb2.append(x[2])
        lb3.append(float(x[3]))
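One practical difference between the two approaches is worth a quick sketch. If no row matches, the zip-based unpacking raises ValueError because there is nothing to unpack, while the loop simply leaves the lists empty. The data in `rows` below is hypothetical and deliberately contains no 'B' rows.

```python
# Hypothetical data with only 'A' rows, to show the empty-match case.
rows = [
    [1, 2, 3, 4, 'A', 5],
    [1, 2, 3, 4, 'A', 5],
]

# Loop version: lists for a label with no matches simply stay empty.
lb1, lb2, lb3 = [], [], []
for x in rows:
    if 'B' in x[4]:
        lb1.append(x[0])
        lb2.append(x[2])
        lb3.append(float(x[3]))
print(lb1, lb2, lb3)  # [] [] []

# zip version: unpacking an empty zip into three names fails.
try:
    zb1, zb2, zb3 = zip(*((x[0], x[2], float(x[3])) for x in rows if 'B' in x[4]))
    zip_failed = False
except ValueError:
    zip_failed = True
print(zip_failed)  # True
```

So if a label may be absent from the data, the loop is the safer choice, or the zip call needs its own guard.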
Answer 1 (score: 1)
You can try numpy, an efficient array library for Python:
import numpy as np
ls = np.array([ # wrap ls into numpy array
[1,2,3,4,'A',5],
[1,2,3,4,'A',5],
[1,2,3,4,'A',5],
[-1,-2,-3,-4,'B',-5],
[-1,-2,-3,-4,'B',-5],
[-1,-2,-3,-4,'B',-5]
])
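# Because ls mixes numbers and strings, np.array stores every element as a string.
a_rows = ls[:,4] == 'A' # select rows with 'A' in the 5th column (index 4)
b_rows = ls[:,4] == 'B'
col_1 = ls[:,0].astype(int) # select the first column and convert it back to int
col_3 = ls[:,2].astype(int)
col_4 = ls[:,3].astype(float)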
la1 = col_1[a_rows] # first column with respect to the rows with A
la2 = col_3[a_rows]
la3 = col_4[a_rows]
lb1 = col_1[b_rows]
lb2 = col_3[b_rows]
lb3 = col_4[b_rows]
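Something to watch for with this approach: a NumPy array has a single dtype, so mixing integers and strings in np.array yields an array of strings, and the selected columns come back as strings too. Below is a minimal sketch of one way around that, assuming it is acceptable to keep the label column in its own array. The names `labels` and `values` are introduced here for illustration only.

```python
import numpy as np

ls = [
    [1, 2, 3, 4, 'A', 5],
    [1, 2, 3, 4, 'A', 5],
    [-1, -2, -3, -4, 'B', -5],
]

# Keep the labels and the numeric data in separate arrays so that
# the numbers are not converted to strings.
labels = np.array([row[4] for row in ls])
values = np.array([row[:4] + row[5:] for row in ls])  # integer array

a_rows = labels == 'A'                  # boolean mask for rows labelled 'A'
la1 = values[a_rows, 0]                 # 1st column of the 'A' rows
la2 = values[a_rows, 2]                 # 3rd column of the 'A' rows
la3 = values[a_rows, 3].astype(float)   # 4th column, converted to float

print(la1.tolist())  # [1, 1]
print(la3.tolist())  # [4.0, 4.0]
```

Because the label column is removed from `values`, the original 6th column sits at index 4 there, while the first four column indices are unchanged.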
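Answer 2 (score: 0)
I think that if you have many such lists, it makes sense to store them in a dictionary, because you can then split the data by condition. A single for loop may also be faster: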
d = {'la1': [],
'la3': [],
'la4': [],
'lb1': [],
'lb3': [],
'lb4': []}
ls = [[1,2,3,4,'A',5],
[1,2,3,4,'A',5],
[1,2,3,4,'A',5],
[-1,-2,-3,-4,'B',-5],
[-1,-2,-3,-4,'B',-5],
[-1,-2,-3,-4,'B',-5]]
for sublist in ls:
    if sublist[4] == "A":
        d['la1'].append(int(sublist[0]))
        d['la3'].append(int(sublist[2]))
        d['la4'].append(float(sublist[3]))
    elif sublist[4] == "B":
        d['lb1'].append(int(sublist[0]))
        d['lb3'].append(int(sublist[2]))
        d['lb4'].append(float(sublist[3]))
print (d)
#{'lb4': [-4.0, -4.0, -4.0], 'lb1': [-1, -1, -1], 'la3': [3, 3, 3], 'la4': [4.0, 4.0, 4.0], 'la1': [1, 1, 1], 'lb3': [-3, -3, -3]}
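If the set of labels is not known in advance, the same idea can be sketched with collections.defaultdict, grouping first by label and then by column index. The function name `split_columns` and its parameters are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict

def split_columns(rows, label_index, column_indices):
    """Group the selected columns by the label found at label_index."""
    # Each new label gets its own dict of empty lists, one per column.
    groups = defaultdict(lambda: {c: [] for c in column_indices})
    for row in rows:
        bucket = groups[row[label_index]]
        for c in column_indices:
            bucket[c].append(row[c])
    return dict(groups)

ls = [
    [1, 2, 3, 4, 'A', 5],
    [1, 2, 3, 4, 'A', 5],
    [-1, -2, -3, -4, 'B', -5],
]

result = split_columns(ls, label_index=4, column_indices=(0, 2, 3))
print(result['A'][0])  # [1, 1]
print(result['B'][3])  # [-4]
```

This still makes a single pass over the rows, and adding a new label such as 'C' to the data needs no code change.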
Answer 3 (score: 0)
Use numpy arrays. They are usually faster than plain lists for large data. Try running the code below line by line:
import numpy as np
ls = np.array([[1,2,3,4,'A',5],[1,2,3,4,'A',5],[1,2,3,4,'A',5],[-1,-2,-3,-4,'B',-5],[-1,-2,-3,-4,'B',-5],[-1,-2,-3,-4,'B',-5]])
filterA = (ls[:,4]=='A')
filterB = (ls[:,4]=='B')
newarrayA=ls[filterA]
newarrayB=ls[filterB]
selectedcolumnsA=newarrayA[:,(0,2,3)]
selectedcolumnsB=newarrayB[:,(0,2,3)]
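# The mixed-type array holds strings, so convert the columns back to numbers.
la1,la2,la3=selectedcolumnsA[:,0].astype(int),selectedcolumnsA[:,1].astype(int),selectedcolumnsA[:,2].astype(float)
lb1,lb2,lb3=selectedcolumnsB[:,0].astype(int),selectedcolumnsB[:,1].astype(int),selectedcolumnsB[:,2].astype(float)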
Hope this helps. If you are not comfortable with this, try learning numpy. It will definitely help you in the future.