Removing empty and all-zero arrays from a list of arrays in Python

Asked: 2013-12-05 13:14:09

Tags: python arrays sorting numpy filtering

I am working with some Python data that comes as a list of arrays:

LA=
[array([  99.08322813,  253.42371683,  300.792029  ])
array([  51.55274095,  106.29707418,  0])
array([0, 0 ,0 , 0, 0])
array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493, 453.56783459])
array([ 105.61643877,  442.76668729,  450.37335607])
array([ 348.84179544])
array([], dtype=float64)
array([0, 0 , 0])
array([ 295.05603151,  0,  451.77083268,  500.81771919])
array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919])
array([  91.86758237,  148.70156948,  488.70648486,  507.31389766])
array([ 353.68691095])
array([ 208.21919198,  246.57665959,  0,  251.33820305, 394.34266882])
array([], dtype=float64)]

In my data, I get some empty arrays:

array([], dtype=float64)

and arrays filled with zeros:

array([0, 0, 0])

How can I remove both kinds of arrays in a simple, automated way, so that I get:

LA=
[array([  99.08322813,  253.42371683,  300.792029  ])
array([  51.55274095,  106.29707418,  0])
array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493, 453.56783459])
array([ 105.61643877,  442.76668729,  450.37335607])
array([ 348.84179544])
array([ 295.05603151,  0,  451.77083268,  500.81771919])
array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919])
array([  91.86758237,  148.70156948,  488.70648486,  507.31389766])
array([ 353.68691095])
array([ 208.21919198,  246.57665959,  0,  251.33820305, 394.34266882])]

Finally, I would also like to remove the zeros inside the remaining arrays while keeping the list-of-arrays format:

LA=
[array([  99.08322813,  253.42371683,  300.792029  ])
array([  51.55274095,  106.29707418])
array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493, 453.56783459])
array([ 105.61643877,  442.76668729,  450.37335607])
array([ 348.84179544])
array([ 295.05603151,  451.77083268,  500.81771919])
array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919])
array([  91.86758237,  148.70156948,  488.70648486,  507.31389766])
array([ 353.68691095])
array([ 208.21919198,  246.57665959,  251.33820305, 394.34266882])]

Thanks in advance.

2 Answers:

Answer 0 (score: 5)

Using NumPy and a list comprehension:

>>> from numpy import *

Solution 1:

>>> [x[x!=0] for x in LA if len(x) and len(x[x!=0])]                          
[array([  99.08322813,  253.42371683,  300.792029  ]),                                           
 array([  51.55274095,  106.29707418]),                                                          
 array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493,                              
        453.56783459]),                                                                          
 array([ 105.61643877,  442.76668729,  450.37335607]),                                           
 array([ 348.84179544]),                                                                         
 array([ 295.05603151,  451.77083268,  500.81771919]),                                           
 array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919]),                            
 array([  91.86758237,  148.70156948,  488.70648486,  507.31389766]),                            
 array([ 353.68691095]),                                                                         
 array([ 208.21919198,  246.57665959,  251.33820305,  394.34266882])]    

Solution 2:

>>> [x[x!=0] for x in LA if count_nonzero(x)]                          
[array([  99.08322813,  253.42371683,  300.792029  ]),                                           
 array([  51.55274095,  106.29707418]),                                                          
 array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493,                              
        453.56783459]),                                                                          
 array([ 105.61643877,  442.76668729,  450.37335607]),                                           
 array([ 348.84179544]),                                                                         
 array([ 295.05603151,  451.77083268,  500.81771919]),                                           
 array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919]),                            
 array([  91.86758237,  148.70156948,  488.70648486,  507.31389766]),                            
 array([ 353.68691095]),                                                                         
 array([ 208.21919198,  246.57665959,  251.33820305,  394.34266882])]    
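
For reference, here is a self-contained sketch of Solution 2 that uses an explicit `import numpy as np` instead of the star import. The list `LA` below is a shortened stand-in for the data in the question, and the helper name `drop_zeros` is hypothetical, chosen only for this illustration.

```python
import numpy as np

# Shortened stand-in for the list in the question (illustrative values).
LA = [
    np.array([99.08322813, 253.42371683, 300.792029]),
    np.array([51.55274095, 106.29707418, 0.0]),
    np.array([0.0, 0.0, 0.0]),
    np.array([], dtype=np.float64),
    np.array([295.05603151, 0.0, 451.77083268, 500.81771919]),
]


def drop_zeros(arrays):
    """Drop empty and all-zero arrays, then strip zeros from the rest."""
    # np.count_nonzero returns 0 for both empty and all-zero arrays,
    # so a single truthiness check filters out both cases.
    return [x[x != 0] for x in arrays if np.count_nonzero(x)]


cleaned = drop_zeros(LA)
for arr in cleaned:
    print(arr)
```

The result is still a plain Python list of NumPy arrays, so the arrays may have different lengths.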

Timing comparison:

In [56]: %timeit  [x[x!=0] for x in LA if len(x) and len(x[x!=0])]                     
10000 loops, best of 3: 176 µs per loop                                                          

In [88]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]                                   
10000 loops, best of 3: 89.7 µs per loop   

#@gnibbler's solution:

In [82]: %timeit [x.compress(x) for x in LA if x.any()]                                          
10000 loops, best of 3: 138 µs per loop  

Timing results for larger arrays:

In [140]: LA = [resize(x, 10**5) for x in LA]                                                    

In [142]: %timeit [x[x!=0] for x in LA if len(x) and len(x[x!=0])]                               
10 loops, best of 3: 26.7 ms per loop                                                            

In [143]: %timeit [x[x!=0] for x in LA if count_nonzero(x) > 0]                                  
10 loops, best of 3: 26 ms per loop                                                              

In [144]: %timeit [x.compress(x) for x in LA if x.any()]                                         
10 loops, best of 3: 42.7 ms per loop                                                            

In [145]: %timeit [x.compress(x) for x in LA if count_nonzero(x)]                                
10 loops, best of 3: 45.8 ms per loop                                                            

In [146]: %timeit [x[x!=0] for x in LA if x.any()]                                               
10 loops, best of 3: 22.9 ms per loop                                                            

In [147]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]                                      
10 loops, best of 3: 26.2 ms per loop  
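
The timings above come from IPython's `%timeit`. Outside IPython, a rough equivalent can be sketched with the standard `timeit` module. The test data here is randomly generated and purely an assumption, so the absolute numbers will not match those above and will vary by machine and NumPy version.

```python
import timeit

import numpy as np

# Hypothetical test data: a mix of regular, all-zero, and empty arrays.
rng = np.random.default_rng(0)
LA = [rng.integers(0, 5, size=1000).astype(float) for _ in range(10)]
LA += [np.zeros(1000), np.array([], dtype=np.float64)]

variants = {
    "mask + len": lambda: [x[x != 0] for x in LA if len(x) and len(x[x != 0])],
    "mask + count_nonzero": lambda: [x[x != 0] for x in LA if np.count_nonzero(x)],
    "mask + any": lambda: [x[x != 0] for x in LA if x.any()],
    "compress + any": lambda: [x.compress(x) for x in LA if x.any()],
}

# Run each variant once so the outputs can be compared for equality.
results = {name: fn() for name, fn in variants.items()}

for name, fn in variants.items():
    # Best of 5 repeats of 100 calls each.
    best = min(timeit.repeat(fn, number=100, repeat=5))
    print(f"{name:>22}: {best / 100 * 1e6:.1f} us per call")
```

All four variants should return the same arrays, so the choice between them is mainly about speed and readability.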

Answer 1 (score: 5)

A list comprehension should take care of the first part:

[x for x in LA if x.any()]

You can use compress to do the second part:

[x.compress(x) for x in LA if x.any()]

A faster version based on Ashwini's idea:

[x.compress(x) for x in LA if count_nonzero(x)]
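
A small sketch of why these one-liners cover both cases: `any()` is false for an empty array as well as for an all-zero one, and `x.compress(x)` treats nonzero entries as true. The sample values are made up for illustration.

```python
import numpy as np

empty = np.array([], dtype=np.float64)
zeros = np.array([0.0, 0.0, 0.0])
mixed = np.array([295.05603151, 0.0, 451.77083268])

# any() is False for both the empty and the all-zero array.
print(empty.any(), zeros.any(), mixed.any())

# compress(x) keeps the entries where the condition is nonzero,
# which here gives the same result as boolean-mask indexing.
print(mixed.compress(mixed))
print(mixed[mixed != 0])
```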

Timing:

In [89]: %timeit [x.compress(x) for x in LA if count_nonzero(x)]  #clear winner                                
10000 loops, best of 3: 20.2 µs per loop