I'm working with some Python data that is a list of arrays taken from a table:
LA=
[array([ 99.08322813, 253.42371683, 300.792029 ])
array([ 51.55274095, 106.29707418, 0])
array([0, 0 ,0 , 0, 0])
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493, 453.56783459])
array([ 105.61643877, 442.76668729, 450.37335607])
array([ 348.84179544])
array([], dtype=float64)
array([0, 0 , 0])
array([ 295.05603151, 0, 451.77083268, 500.81771919])
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919])
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766])
array([ 353.68691095])
array([ 208.21919198, 246.57665959, 0, 251.33820305, 394.34266882])
array([], dtype=float64)]
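In my data I get some empty arrays:
array([], dtype=float64)
and arrays filled with zeros:
array([0, 0, 0])
How can I get rid of both kinds of arrays in a simple, automated way, to get the following?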
LA=
[array([ 99.08322813, 253.42371683, 300.792029 ])
array([ 51.55274095, 106.29707418, 0])
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493, 453.56783459])
array([ 105.61643877, 442.76668729, 450.37335607])
array([ 348.84179544])
array([ 295.05603151, 0, 451.77083268, 500.81771919])
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919])
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766])
array([ 353.68691095])
array([ 208.21919198, 246.57665959, 0, 251.33820305, 394.34266882])
Finally, I would like to remove the remaining zeros while keeping the list-of-arrays format:
LA=
[array([ 99.08322813, 253.42371683, 300.792029 ])
array([ 51.55274095, 106.29707418])
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493, 453.56783459])
array([ 105.61643877, 442.76668729, 450.37335607])
array([ 348.84179544])
array([ 295.05603151, 451.77083268, 500.81771919])
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919])
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766])
array([ 353.68691095])
array([ 208.21919198, 246.57665959, 251.33820305, 394.34266882])
Thanks in advance.
Answer 0 (score: 5)
Using NumPy and a list comprehension:
>>> from numpy import *
Solution 1:
>>> [x[x!=0] for x in LA if len(x) and len(x[x!=0])]
[array([ 99.08322813, 253.42371683, 300.792029 ]),
array([ 51.55274095, 106.29707418]),
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493,
453.56783459]),
array([ 105.61643877, 442.76668729, 450.37335607]),
array([ 348.84179544]),
array([ 295.05603151, 451.77083268, 500.81771919]),
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919]),
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766]),
array([ 353.68691095]),
array([ 208.21919198, 246.57665959, 251.33820305, 394.34266882])]
Solution 2:
>>> [x[x!=0] for x in LA if count_nonzero(x)]
[array([ 99.08322813, 253.42371683, 300.792029 ]),
array([ 51.55274095, 106.29707418]),
array([ 149.07283952, 191.45513754, 251.19610503, 393.50806493,
453.56783459]),
array([ 105.61643877, 442.76668729, 450.37335607]),
array([ 348.84179544]),
array([ 295.05603151, 451.77083268, 500.81771919]),
array([ 295.05603151, 307.37232315, 451.77083268, 500.81771919]),
array([ 91.86758237, 148.70156948, 488.70648486, 507.31389766]),
array([ 353.68691095]),
array([ 208.21919198, 246.57665959, 251.33820305, 394.34266882])]
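As a self-contained sketch of Solution 2 outside the interactive session: the sample list below is hypothetical (shortened values modeled on the question), and `drop_zeros` is an illustrative helper name, not something from NumPy. It uses an explicit `import numpy as np` instead of the star import above.

```python
import numpy as np

# Hypothetical sample data modeled on the question (values shortened).
LA = [
    np.array([99.08, 253.42, 300.79]),
    np.array([51.55, 106.30, 0.0]),
    np.array([0.0, 0.0, 0.0]),            # all zeros: should be dropped
    np.array([], dtype=np.float64),       # empty: should be dropped
    np.array([295.06, 0.0, 451.77, 500.82]),
]


def drop_zeros(arrays):
    """Drop empty and all-zero arrays, then strip zeros from the rest."""
    # count_nonzero(x) is 0 for both empty and all-zero arrays, so one
    # condition covers both cases; x[x != 0] keeps only nonzero entries.
    return [x[x != 0] for x in arrays if np.count_nonzero(x)]


cleaned = drop_zeros(LA)
for arr in cleaned:
    print(arr)
```

The result is still a plain Python list of arrays, which matters here because the arrays have different lengths and cannot be stacked into one rectangular array.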
Timing comparison:
In [56]: %timeit [x[x!=0] for x in LA if len(x) and len(x[x!=0])]
10000 loops, best of 3: 176 µs per loop
In [88]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]
10000 loops, best of 3: 89.7 µs per loop
#@gnibbler's solution:
In [82]: %timeit [x.compress(x) for x in LA if x.any()]
10000 loops, best of 3: 138 µs per loop
Timing results for larger arrays:
In [140]: LA = [resize(x, 10**5) for x in LA]
In [142]: %timeit [x[x!=0] for x in LA if len(x) and len(x[x!=0])]
10 loops, best of 3: 26.7 ms per loop
In [143]: %timeit [x[x!=0] for x in LA if count_nonzero(x) > 0]
10 loops, best of 3: 26 ms per loop
In [144]: %timeit [x.compress(x) for x in LA if x.any()]
10 loops, best of 3: 42.7 ms per loop
In [145]: %timeit [x.compress(x) for x in LA if count_nonzero(x)]
10 loops, best of 3: 45.8 ms per loop
In [146]: %timeit [x[x!=0] for x in LA if x.any()]
10 loops, best of 3: 22.9 ms per loop
In [147]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]
10 loops, best of 3: 26.2 ms per loop
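To reproduce this kind of comparison without IPython's `%timeit`, something along these lines should work with the standard `timeit` module. The test data is an assumption (random arrays with roughly half the entries zeroed), so the absolute numbers will differ from those above and depend on the machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 10 arrays of 10**5 values, roughly half of them zero.
big = [rng.integers(0, 2, size=10**5) * rng.random(10**5) for _ in range(10)]

candidates = {
    "mask + any": lambda: [x[x != 0] for x in big if x.any()],
    "mask + count_nonzero": lambda: [x[x != 0] for x in big if np.count_nonzero(x)],
    "compress + any": lambda: [x.compress(x) for x in big if x.any()],
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats, 5 calls each, reported as seconds per call.
    timings[name] = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{name}: {timings[name] * 1e3:.2f} ms per loop")
```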
Answer 1 (score: 5)
A list comprehension should take care of the first part:
[x for x in LA if x.any()]
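A small sketch of why this filter handles both cases, using made-up sample arrays: `any()` is False for an empty array as well as for an all-zero one, so a single test removes both.

```python
import numpy as np

samples = [
    np.array([], dtype=np.float64),  # empty: any() is False
    np.array([0.0, 0.0, 0.0]),       # all zeros: any() is False
    np.array([51.5, 0.0]),           # one nonzero value: any() is True
]

kept = [x for x in samples if x.any()]
print(kept)
```

Note that the surviving array still contains its zero; this step only removes whole arrays.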
You can use compress to remove the zeros:
[x.compress(x) for x in LA if x.any()]
A faster version based on Ashwini's idea:
[x.compress(x) for x in LA if count_nonzero(x)]
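For readers unfamiliar with `compress`: passing the array as its own condition treats nonzero entries as True, so it should select the same elements as the boolean mask `x[x != 0]`. A quick check on one array from the question:

```python
import numpy as np

x = np.array([295.05603151, 0.0, 451.77083268, 500.81771919])

via_compress = x.compress(x)  # x itself is the condition; nonzero means keep
via_mask = x[x != 0]          # explicit boolean mask

print(np.array_equal(via_compress, via_mask))
```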
Timing:
In [89]: %timeit [x.compress(x) for x in LA if count_nonzero(x)] #clear winner
10000 loops, best of 3: 20.2 µs per loop