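I have a numpy array that looks like this:
[None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None.......]
I always need the first and last value of each run of numbers, so the result should look like this:
[8,2,169,23,111,74,...]
Does anyone know an easy way to get these numbers?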
Answer 0 (score: 2)
A NumPy array that contains None and integers will have dtype object anyway.
It seems easier to work with a list in the first place:
res = []
# L is the input list; compare each element with its right-hand neighbour
for x1, x2 in zip(L[:-1], L[1:]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None):
        res.append(x2)
Now res is:
[8, 2, 169, 23, 111, 74]
To avoid wrong results when the list does not start or end with None, restrict the search to the part between the first and the last None:
res = []
start = L.index(None)
end = len(L) - L[::-1].index(None)
for x1, x2 in zip(L[start:end-1], L[start+1:end]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None):
        res.append(x2)
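To see how the bounded variant treats a list that starts and ends with numbers, it can be wrapped in a small function. This is only a sketch: the name `edges_between_nones` and the sample list are made up for illustration, and returning an empty list when there is no None at all is my assumption (the snippet above would raise ValueError in that case).

```python
def edges_between_nones(L):
    """Return values next to a None, looking only between the first and last None."""
    if None not in L:
        return []  # assumption: nothing to report when the list has no None
    start = L.index(None)
    end = len(L) - L[::-1].index(None)
    res = []
    for x1, x2 in zip(L[start:end - 1], L[start + 1:end]):
        if x1 is not None and x2 is None:
            res.append(x1)
        elif x1 is None and x2 is not None:
            res.append(x2)
    return res


sample = [1, 2, None, 5, 6, None, 3]  # hypothetical input with leading/trailing numbers
print(edges_between_nones(sample))  # [5, 6]
```

The leading 1, 2 and the trailing 3 are ignored because they lie outside the first and last None.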
If your NumPy array uses NaN instead of None:
import numpy as np

a = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, 8, 7, 2, np.nan,
              np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 169, 37, 9,
              7, 23, np.nan, np.nan, 111, 24, 8, 7, 9, 12, 74, np.nan])
you can do this in a vectorized way:
b = a[np.isnan(np.roll(a, 1)) | np.isnan(np.roll(a, -1))]
res = b[~np.isnan(b)]
Now res looks like this:
array([ 8., 2., 169., 23., 111., 74.])
Again, a version that limits the search to the part between the first and last NaN:
indices = np.arange(len(a))[np.isnan(a)]
short = a[indices[0]:indices[-1]]
b = short[np.isnan(np.roll(short, 1)) | np.isnan(np.roll(short, -1))]
res = b[~np.isnan(b)]
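One thing to keep in mind is that np.roll wraps around, so in the snippets above the first element ends up being compared with the last one. As an alternative sketch (not the method used in this answer), the neighbour checks can be done on a padded validity mask, which treats both ends of the array as boundaries. The function name `nan_group_edges` is hypothetical.

```python
import numpy as np


def nan_group_edges(a):
    """Return non-NaN values that sit at the start or end of a run of non-NaN values."""
    a = np.asarray(a, dtype=float)
    valid = ~np.isnan(a)
    # Pad with False so that both array ends count as "missing" neighbours.
    padded = np.concatenate(([False], valid, [False]))
    left_ok = padded[:-2]   # True where the left neighbour is a number
    right_ok = padded[2:]   # True where the right neighbour is a number
    # Keep numbers that are missing at least one numeric neighbour.
    return a[valid & ~(left_ok & right_ok)]


a = np.array([np.nan, 8, 7, 2, np.nan, np.nan, 169, 37, 9, 7, 23, np.nan])
print(nan_group_edges(a))
```

Unlike the bounded version above, this sketch keeps leading and trailing numbers, so pick whichever behaviour matches your data.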
Answer 1 (score: 1)
Using the pandas package (thanks to innisfree for pointing out the bug that occurs when the series does not start or end with None):
import numpy
import pandas

x = numpy.array([1, 3, 4, None, None, None, None, None, 8, 7, 2, None, None, 7, 8])
# pad with None on both sides so leading/trailing numbers count as edges
z = pandas.Series(numpy.append(numpy.insert(x, 0, None), None))
res = z[z.shift(1).isnull() | z.shift(-1).isnull()].dropna()
Answer 2 (score: 1)
Borrowing heavily from this answer, you can do the following.
Convert None to nan:
import numpy as np

x = [None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
     None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None]
# np.float was removed in NumPy 1.24; the builtin float works everywhere
x = np.array(x, dtype=float)
Then:
x = np.vstack([x[s].take([0,-1]) for s in np.ma.clump_unmasked(np.ma.masked_invalid(x))]).flatten()
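This splits your array into the sub-arrays that correspond to contiguous groups of non-nan values. It then uses .take([0,-1]) to get the first and last element of each of those sub-arrays, stacks them into a single array, and flattens it.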
print(repr(x))
array([ 8., 2., 169., 23., 111., 74.])
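To see what the one-liner works with, it can help to look at the intermediate result. As far as I know, np.ma.clump_unmasked returns a list of slice objects, one per contiguous run of unmasked values. The short array below is made up for illustration.

```python
import numpy as np

# Hypothetical short input with two runs of numbers separated by nan
x = np.array([np.nan, 8, 7, 2, np.nan, np.nan, 169, 37, np.nan])

# One slice per contiguous run of non-nan values
runs = np.ma.clump_unmasked(np.ma.masked_invalid(x))

# First and last element of each run, stacked and flattened as in the answer
edges = np.vstack([x[s].take([0, -1]) for s in runs]).flatten()

print(runs)
print(edges)
```

Note that a run consisting of a single number contributes that number twice, because index 0 and index -1 point at the same element.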
Answer 3 (score: 1)
a = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None,
None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]
a.append(None)
[a[e] for e in range(len(a)-1) if a[e] is not None and (a[e-1] is None or a[e+1] is None)]
Output:
[8, 2, 169, 23, 111, 74]
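The comprehension relies on a small trick: after a.append(None), the expression a[e-1] at e == 0 is a[-1], which is the appended None, so a leading number is treated as an edge too. A non-mutating sketch of the same idea follows; the helper name `edges_with_sentinel` is hypothetical.

```python
def edges_with_sentinel(values):
    """Same idea as the comprehension above, but without modifying the input list."""
    padded = list(values) + [None]  # trailing sentinel, also reached via index -1
    return [
        padded[i]
        for i in range(len(padded) - 1)
        if padded[i] is not None
        and (padded[i - 1] is None or padded[i + 1] is None)
    ]


print(edges_with_sentinel([1, 2, None, 5]))  # [1, 2, 5]
```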
Answer 4 (score: 1)
A vectorized way using numpy:
arr = np.asarray(arr)
# find all the nones - need the np.array to work around backwards-compatible misbehavior
is_none = arr == np.array(None)
# find which values are on an edge. Start and end get a free pass for being not none
is_edge = ~is_none
is_edge[1:-1] &= (is_none[2:] | is_none[:-2])
# use the boolean mask as an index
res = arr[is_edge]
is_edge can be computed more verbosely, but perhaps more clearly, as follows:
is_edge = ~is_none & (
    np.pad(is_none[1:], (0,1), mode='constant', constant_values=True)
    |
    np.pad(is_none[:-1], (1,0), mode='constant', constant_values=True)
)
Answer 5 (score: 0)
Here is how I would do it with a regular Python list. I don't have an answer specific to numpy arrays.
result = []
for ind, n in enumerate(lst):
    if n is not None and (lst[ind-1] is None or lst[ind+1] is None):
        result.append(n)
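Note: this misbehaves if index 0 or len - 1 (the last element) is a number. At index 0, lst[ind-1] wraps around to the last element, and at the last index lst[ind+1] can raise an IndexError.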
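A bounds-checked variant avoids both problems mentioned in the note. This is a minimal sketch, assuming that a position outside the list should count as "not None"; the function name `edge_values_safe` is hypothetical.

```python
def edge_values_safe(lst):
    """Collect numbers that have a None directly before or after them."""
    result = []
    last = len(lst) - 1
    for ind, n in enumerate(lst):
        if n is None:
            continue
        # Only look at neighbours that actually exist (no wrap-around, no IndexError).
        prev_is_none = ind > 0 and lst[ind - 1] is None
        next_is_none = ind < last and lst[ind + 1] is None
        if prev_is_none or next_is_none:
            result.append(n)
    return result


print(edge_values_safe([1, 2, None, 5, 6, None, 3]))  # [2, 5, 6, 3]
```

With this choice a leading or trailing number is only reported if it touches a None, which may or may not be what you want.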
Answer 6 (score: 0)
You can use a list comprehension:
ex = [None, None, None, None, None, 8, 7, 2,
      None, None, None, None, None, None, None, 169, 37, 9, 7, 23,
      None, None, 111, 24, 87, 9, 12, 74,
      None]
filt = [y for x, y, z in zip(ex, ex[1:], ex[2:])
        if y is not None and (x is None or z is None)]
# [8, 2, 169, 23, 111, 74]
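This needs no external dependencies. However, if the list is very large, making two extra copies for the zipped iterators may be expensive. It may be possible to avoid that with itertools.
Note that the above may fail if the original list has leading or trailing numbers. Strip them first, e.g.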
# the "ex and" guard avoids an IndexError if the list contains no None at all
while ex and ex[0] is not None:
    del ex[0]
while ex and ex[-1] is not None:
    del ex[-1]
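The itertools idea mentioned above could look roughly like this. It is a sketch, not a tuned implementation: itertools.tee and itertools.islice replace the slices ex[1:] and ex[2:], so no copies of the list are made. The function name `edge_values` is hypothetical.

```python
from itertools import islice, tee


def edge_values(seq):
    """Yield non-None items that have a None neighbour, without copying seq."""
    prev_it, cur_it, next_it = tee(seq, 3)
    cur_it = islice(cur_it, 1, None)    # plays the role of seq[1:]
    next_it = islice(next_it, 2, None)  # plays the role of seq[2:]
    for prev, cur, nxt in zip(prev_it, cur_it, next_it):
        if cur is not None and (prev is None or nxt is None):
            yield cur


ex = [None, None, 8, 7, 2, None, None, 169, 37, 23, None]
print(list(edge_values(ex)))  # [8, 2, 169, 23]
```

Like the comprehension, this never reports the very first or very last item, so the caveat about leading and trailing numbers still applies.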
Answer 7 (score: 0)
The beauty of zip is that you can do this "triple iteration" kind of thing (looping over a list while looking at three items at a time):
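result = []
# x is the numpy (object) array; x[1:] supplies the next item, hstack((None, x)) the previous one
for nxt, item, prev in zip(x[1:], x, np.hstack((None, x))):
    # "is not None" instead of truthiness, so a value of 0 is not skipped
    if item is not None and (prev is None or nxt is None):
        result.append(item)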
The other answers are reasonable too; this one is an attempt at readability.
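Answer 8 (score: 0)
This is not exactly numpy, and it involves a function from another external package: iteration_utilities.split
But if you operate on a list it might be relatively fast: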
>>> lst = [None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
...        None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None]
>>> from iteration_utilities import Iterable, is_None
>>> from operator import itemgetter
>>> Iterable(lst).split(is_None       # split by None
...          ).filter(None            # remove empty lists
...          ).map(itemgetter(0, -1)  # get the first and last element of each split part
...          ).flatten(               # flatten the result
...          ).as_list()              # and convert it to a list
[8, 2, 169, 23, 111, 74]
Note that you can also do this in pure Python:
def first_last_between_None(iterable):
    last = None
    for item in iterable:
        # If we're in the None-part just proceed until we get a not None
        if last is None:
            if item is None:
                continue
            else:
                # Not None, reset last and yield the current value
                last = item
                yield item
        else:
            # If the next item is None we're at the end of the number-part
            # yield the last item.
            if item is None:
                yield last
            last = item
    if last is not None:
        yield last
>>> list(first_last_between_None(lst))
[8, 2, 169, 23, 111, 74]
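If you want to discard the first and last values when the list does not start or end with None, just take the appropriate slice of the result: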
if lst[0] is not None:
    res = res[2:]
if lst[-1] is not None:
    res = res[:-2]
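The slicing works because each run of numbers contributes exactly two values to the result. A compact way to try this out, using itertools.groupby instead of the generator above (a different technique, shown only as a sketch), is below; the name `first_last_per_run` and the sample list are hypothetical.

```python
from itertools import groupby


def first_last_per_run(iterable):
    """Return the first and last item of every run of non-None values."""
    out = []
    for is_none, run in groupby(iterable, key=lambda v: v is None):
        if not is_none:
            run = list(run)
            out.extend((run[0], run[-1]))  # two values per run, even for a run of one
    return out


lst = [1, 2, None, 5, 6, None, 3]  # starts and ends with numbers
res = first_last_per_run(lst)      # [1, 2, 5, 6, 3, 3]
if lst[0] is not None:
    res = res[2:]                  # drop the pair from the leading run
if lst[-1] is not None:
    res = res[:-2]                 # drop the pair from the trailing run
print(res)  # [5, 6]
```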