Question

I have a numpy array that looks like this:

[None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None.......]

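I always need the first and the last value of each run of numbers between the None values, so the result should look like this: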

[8, 2, 169, 23, 111, 74, ...]

Does anyone know an easy way to get these numbers?

Answer 1

A NumPy array containing None and integers will have object dtype anyway. It seems easier to use a list in the first place:

res = []
for x1, x2 in zip(L[:-1], L[1:]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None): 
        res.append(x2)

Now res is:

[8, 2, 169, 23, 111, 74]

To avoid wrong results when the list does not start or end with None, restrict the search to the part between the first and the last None:

res = []
start = L.index(None)
end = len(L) - L[::-1].index(None)

for x1, x2 in zip(L[start:end-1], L[start+1:end]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None): 
        res.append(x2)
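As a rough sketch, the restricted search above can be wrapped in a function and tried on a list that starts and ends with numbers. The function name `edges_between_nones`, the early return for a list without any None, and the sample data are assumptions added for illustration, not part of the original answer.

```python
def edges_between_nones(values):
    """Return the first and last value of each run of numbers enclosed by None."""
    if None not in values:
        return []  # assumption: without any None, nothing lies "between" None values
    start = values.index(None)                     # index of the first None
    end = len(values) - values[::-1].index(None)   # one past the last None
    res = []
    for x1, x2 in zip(values[start:end - 1], values[start + 1:end]):
        if x1 is not None and x2 is None:
            res.append(x1)   # x1 is the last value of a run
        elif x1 is None and x2 is not None:
            res.append(x2)   # x2 is the first value of a run
    return res


sample = [1, 3, None, 8, 7, 2, None, None, 5, None, 4]
print(edges_between_nones(sample))  # [8, 2, 5, 5]
```

The leading 1, 3 and the trailing 4 are ignored, and a run consisting of a single number (the 5) is reported twice, once as the first and once as the last value.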

If your NumPy array uses NaN instead of None:

import numpy as np

a = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, 8, 7, 2, np.nan,
              np.nan , np.nan , np.nan, np.nan, np.nan, np.nan, 169, 37, 9,
              7 ,23, np.nan , np.nan , 111, 24, 8, 7 , 9, 12 , 74, np.nan])

You can do this in a vectorized way:

b = a[np.isnan(np.roll(a, 1)) | np.isnan(np.roll(a, -1))]
res = b[~np.isnan(b)]

Now res looks like this:

array([   8.,    2.,  169.,   23.,  111.,   74.])

Again, here is a version that restricts the search to the part between the first and the last NaN:

indices = np.arange(len(a))[np.isnan(a)]
short = a[indices[0]:indices[-1]]
b = short[np.isnan(np.roll(short, 1)) | np.isnan(np.roll(short, -1))]
res = b[~np.isnan(b)]
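Note that np.roll wraps around, so the "previous" neighbour of the first element is the last element and vice versa. If you would rather treat both ends of the array as boundaries, one possible sketch pads the NaN mask explicitly. The function name `nan_run_edges` and the decision to count the array ends as boundaries are assumptions for illustration, not part of the original answer.

```python
import numpy as np


def nan_run_edges(a):
    """Values that have NaN, or an end of the array, as a direct neighbour."""
    a = np.asarray(a, dtype=float)
    if a.size == 0:
        return a
    is_nan = np.isnan(a)
    boundary = np.array([True])  # assumption: array ends behave like NaN
    # Shifted masks built by concatenation, so nothing wraps around
    prev_nan = np.concatenate((boundary, is_nan[:-1]))
    next_nan = np.concatenate((is_nan[1:], boundary))
    return a[~is_nan & (prev_nan | next_nan)]


print(nan_run_edges([1, 2, np.nan, 3, 4]))
```

Unlike the loop-based list version, a run consisting of a single number shows up only once here, because each element is selected at most once by the boolean mask.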

Answer 2

Using the pandas package (thanks to innisfree for pointing out the bug that occurs if the series does not start or end with None):

import numpy
import pandas
x=numpy.array([1,3,4,None, None, None, None, None, 8, 7, 2, None, None,7,8])
z = pandas.Series(numpy.append(numpy.insert(x,0,None),None))
res = z[z.shift(1).isnull() | z.shift(-1).isnull()].dropna()

Answer 3

Stealing heavily from this answer, you could do the following.

Convert None to nan:

x = [None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
     None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None]

x = np.array(x, dtype=float)  # np.float was removed in NumPy 1.24; the builtin float works

Then:

x = np.vstack([x[s].take([0,-1]) for s in np.ma.clump_unmasked(np.ma.masked_invalid(x))]).flatten()

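This splits your array into sub-arrays that correspond to contiguous groups of non-nan values. It then takes the first and last element of each sub-array with .take([0,-1]), stacks the results into a single array, and flattens it.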

print(repr(x))

array([   8.,    2.,  169.,   23.,  111.,   74.])
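To make the clumping step described above more concrete, the following sketch prints the intermediate slices on a shortened array. The variable names (`demo`, `masked`, `runs`, `pairs`, `edges`) and the sample values are made up for illustration.

```python
import numpy as np

demo = np.array([np.nan, 8, 7, 2, np.nan, np.nan, 169, 37, 23, np.nan], dtype=float)

masked = np.ma.masked_invalid(demo)    # NaN entries become masked
runs = np.ma.clump_unmasked(masked)    # one slice per contiguous run of valid values
print(runs)

# First and last element of each run, stacked and flattened into one array
pairs = [demo[s].take([0, -1]) for s in runs]
edges = np.vstack(pairs).flatten()
print(edges)
```

Each slice in `runs` marks where a run of numbers starts and stops, so indexing `demo` with it yields exactly one run.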

Answer 4

a = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

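# Append a trailing None so that a[e+1] is always valid and a[-1] (seen by e == 0) is None
a.append(None)
[a[e] for e in range(len(a)-1) if a[e] is not None and (a[e-1] is None or a[e+1] is None)]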

Output:

[8, 2, 169, 23, 111, 74]

Answer 5

A vectorized approach using numpy:

arr = np.asarray(arr)
# find all the nones - need the np.array to work around backwards-compatible misbehavior
is_none = arr == np.array(None)

# find which values are on an edge. Start and end get a free pass for being not none
is_edge = ~is_none
is_edge[1:-1] &= (is_none[2:] | is_none[:-2])

# use the boolean mask as an index
res = arr[is_edge]

is_edge can also be computed in a more verbose, but perhaps clearer, way:

is_edge = ~is_none & (
    np.pad(is_none[1:], (0,1), mode='constant', constant_values=True)
    |
    np.pad(is_none[:-1], (1,0), mode='constant', constant_values=True)
)

Answer 6

Here is how I would do it with a regular Python list. I don't have an answer specific to numpy arrays.

result = []
for ind, n in enumerate(lst):
    if n is not None and (lst[ind-1] is None or lst[ind+1] is None):
        result.append(n)

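Note: this gives incorrect results if index 0 or len - 1 (the last element) is a number. For the first element, lst[ind-1] wraps around to the last element; for the last element, lst[ind+1] can raise an IndexError.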
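One way to sidestep the problem at both ends is to check the index before looking at a neighbour. The sketch below assumes that the start and the end of the list should count as boundaries, just like None; that choice and the function name `run_edges` are not part of the original answer.

```python
def run_edges(lst):
    """Numbers whose neighbour is None or lies outside the list."""
    result = []
    last = len(lst) - 1
    for ind, n in enumerate(lst):
        if n is None:
            continue
        # Assumption: positions outside the list are treated like None
        prev_is_none = ind == 0 or lst[ind - 1] is None
        next_is_none = ind == last or lst[ind + 1] is None
        if prev_is_none or next_is_none:
            result.append(n)
    return result


print(run_edges([3, 4, None, 8, 7, 2, None, 5, 6]))  # [3, 4, 8, 2, 5, 6]
```

Because `or` short-circuits, `lst[ind - 1]` and `lst[ind + 1]` are only evaluated when the index is known to be inside the list.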

Answer 7

You can use a list comprehension:

ex = [None, None, None, None, None, 8, 7, 2, 
      None, None , None , None, None, None, None, 169, 37, 9 ,7 ,23,
      None , None , 111, 24, 87 , 9, 12 , 74, 
      None]

filt = [y for x, y, z in zip(ex, ex[1:], ex[2:]) 
        if y is not None and (x is None or z is None)]

# [8, 2, 169, 23, 111, 74]

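This requires no external dependencies; however, if the list is particularly large, making two extra copies for the zip iterator could be expensive. It is possible to overcome this with itertools.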
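A minimal sketch of that itertools idea, assuming `tee` and `islice` are an acceptable way to get three offset views without slicing the list; the function name `edge_values` is made up for illustration.

```python
from itertools import islice, tee


def edge_values(values):
    """Same triple-wise test as the list comprehension, without list copies."""
    a, b, c = tee(values, 3)
    # b starts one item later and c two items later than a
    triples = zip(a, islice(b, 1, None), islice(c, 2, None))
    return [y for x, y, z in triples
            if y is not None and (x is None or z is None)]


ex = [None, None, 8, 7, 2, None, None, 169, 37, 23, None]
print(edge_values(ex))  # [8, 2, 169, 23]
```

Like the slicing version, this never examines the very first or very last item as the middle of a triple, so it has the same limitation regarding leading or trailing numbers.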

Note that the above may fail if the original list contains leading or trailing numbers. Strip them first, e.g.

while ex[0] is not None:
    del ex[0]

while ex[-1] is not None:
    del ex[-1]

Answer 8

The beauty of zip is that you can do this kind of "triple iteration" thing (looping over the list while considering three items at a time):

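result = []
# Pad with None on both sides so that every item has a previous and a following neighbour
padded = [None] + list(x) + [None]
for previous, item, following in zip(padded, padded[1:], padded[2:]):
    # "is not None" rather than truthiness, so that a value of 0 is not skipped
    if item is not None and (previous is None or following is None):
        result.append(item)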

The other answers are reasonable too; this one is an attempt at readability and understandability.

Answer 9

It's not exactly numpy, and it involves a function from another external package, iteration_utilities.split, but if you operate on a list it could be relatively fast:

>>> lst = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
       None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

>>> from iteration_utilities import Iterable, is_None
>>> from operator import itemgetter
>>> Iterable(lst).split(is_None           # split by None
                ).filter(None             # remove empty lists
                ).map(itemgetter(0, -1)   # get the first and last element of each split part
                ).flatten(                # flatten the result
                ).as_list()               # and convert it to a list
[8, 2, 169, 23, 111, 74]

Note that you could also do this in pure Python:

def first_last_between_None(iterable):
    last = None
    for item in iterable:
        # If we're in the None-part just proceed until we get a not None
        if last is None:
            if item is None:
                continue
            else:
                # Not None, reset last and yield the current value
                last = item
                yield item
        else:
            # If the next item is None we're at the end of the number-part
            # yield the last item.
            if item is None:
                yield last
            last = item
    if last is not None:
        yield last

>>> list(first_last_between_None(lst))
[8, 2, 169, 23, 111, 74]

If you want to discard the first and last values when the list does not start or end with None, just take the appropriate slice:

if lst[0] is not None:
    res = res[2:]
if lst[-1] is not None:
    res = res[:-2]
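The slicing above relies on every run contributing exactly two values to res. As an alternative sketch that uses only the standard library, itertools.groupby can split the list into runs and make the choice explicit with a flag. The function name `first_last_per_run` and the parameter `drop_unbounded` are hypothetical and not part of the original answer.

```python
from itertools import groupby


def first_last_per_run(values, drop_unbounded=False):
    """First and last value of each run of consecutive non-None items."""
    # Group consecutive items by whether they are None
    groups = [(is_none, list(run))
              for is_none, run in groupby(values, key=lambda v: v is None)]
    result = []
    for i, (is_none, run) in enumerate(groups):
        if is_none:
            continue
        # A run touching the start or end of the input is not enclosed by None
        unbounded = i == 0 or i == len(groups) - 1
        if drop_unbounded and unbounded:
            continue
        result.extend((run[0], run[-1]))
    return result


data = [1, 2, None, 5, None, 7, 8]
print(first_last_per_run(data))                       # [1, 2, 5, 5, 7, 8]
print(first_last_per_run(data, drop_unbounded=True))  # [5, 5]
```

As with the generator above, a run of a single number is reported twice, once as the first and once as the last value.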

First and last numbers between None values in a Python array

9 answers: