First and last number between None values in a Python array

Date: 2016-12-18 22:46:58

Tags: python arrays numpy

I have a numpy array that looks like this:

[None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None.......]

I always need the first and the last value of each run of numbers, so the result should look like this:

[8,2,169,23,111,74,...]

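Does anyone know an easy way to get these numbers?

9 answers:

Answer 0 (score: 2)

A NumPy array that contains None and integers will be of type object anyway. It seems easier to use a list in the first place: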

res = []
for x1, x2 in zip(L[:-1], L[1:]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None): 
        res.append(x2)

res is:

[8, 2, 169, 23, 111, 74]

To avoid wrong results when the list does not start or end with None, limit the search to the part between the first and the last None:

res = []
start = L.index(None)
end = len(L) - L[::-1].index(None)

for x1, x2 in zip(L[start:end-1], L[start+1:end]):
    if (x1 is not None and x2 is None):
        res.append(x1)
    elif (x1 is None and x2 is not None): 
        res.append(x2)
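
As an illustration of the difference, the restricted loop above can be wrapped in a small function and tried on a list that starts and ends with numbers. This is only a sketch: the function name edges_between_nones and the sample list are made up for this example, and the function assumes the list contains at least one None (otherwise L.index(None) raises a ValueError).

```python
def edges_between_nones(L):
    """Hypothetical helper wrapping the restricted loop from the answer."""
    res = []
    start = L.index(None)                 # position of the first None
    end = len(L) - L[::-1].index(None)    # one past the last None
    for x1, x2 in zip(L[start:end - 1], L[start + 1:end]):
        if x1 is not None and x2 is None:
            res.append(x1)                # last number of a run
        elif x1 is None and x2 is not None:
            res.append(x2)                # first number of a run
    return res


# The leading 1, 3 and the trailing 5 lie outside the first/last None
# and are therefore ignored.
print(edges_between_nones([1, 3, None, 8, 7, 2, None, 5]))  # [8, 2]
```

Note that a run consisting of a single number is reported twice, once as the first and once as the last value of its run.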

If your NumPy array uses NaN instead of None:

a = np.array([np.nan, np.nan, np.nan, np.nan, np.nan, 8, 7, 2, np.nan,
              np.nan , np.nan , np.nan, np.nan, np.nan, np.nan, 169, 37, 9,
              7 ,23, np.nan , np.nan , 111, 24, 8, 7 , 9, 12 , 74, np.nan])

you can do this in a vectorized way:

b = a[np.isnan(np.roll(a, 1)) | np.isnan(np.roll(a, -1))]
res = b[~np.isnan(b)]

Now res looks like this:

array([   8.,    2.,  169.,   23.,  111.,   74.])

Again, here is a version that limits the search to the part between the first and the last NaN:

indices = np.arange(len(a))[np.isnan(a)]
short = a[indices[0]:indices[-1]]
b = short[np.isnan(np.roll(short, 1)) | np.isnan(np.roll(short, -1))]
res = b[~np.isnan(b)]

Answer 1 (score: 1)

Using the pandas package - thanks to innisfree for pointing out the bug that occurs if the series does not start/end with None:

import numpy
import pandas

x = numpy.array([1,3,4,None, None, None, None, None, 8, 7, 2, None, None,7,8])
# pad both ends with None so that leading/trailing numbers also have a None neighbour
z = pandas.Series(numpy.append(numpy.insert(x,0,None),None))
res = z[z.shift(1).isnull() | z.shift(-1).isnull()].dropna()

Answer 2 (score: 1)

Stealing heavily from this answer, you can do this:

Convert None to nan

import numpy as np

x = [None, None, None, None, None, 8, 7, 2, None, None, None, None, None,
     None, None, 169, 37, 9, 7, 23, None, None, 111, 24, 8, 7, 9, 12, 74, None]

# np.float was removed in newer NumPy releases; the builtin float does the same job
x = np.array(x, dtype=float)

Then:

x = np.vstack([x[s].take([0,-1]) for s in np.ma.clump_unmasked(np.ma.masked_invalid(x))]).flatten()

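This splits your array into sub-arrays that correspond to contiguous groups of non-nan values. It then takes the first and last element of each sub-array with .take([0,-1]), stacks the results into a single array, and flattens it.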

print(repr(x))

array([   8.,    2.,  169.,   23.,  111.,   74.])
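
To make the intermediate step visible, the following sketch prints what np.ma.clump_unmasked returns before the first and last elements are taken. The shortened sample array and the names runs and pairs are assumptions made for this illustration, not part of the answer.

```python
import numpy as np

# Shortened sample data, with None already replaced by NaN.
x = np.array([np.nan, np.nan, 8, 7, 2, np.nan, np.nan, 169, 37, 9, 7, 23,
              np.nan, 111, 24, 8, 7, 9, 12, 74, np.nan], dtype=float)

# masked_invalid masks the NaN entries; clump_unmasked returns one slice
# per contiguous run of unmasked (non-NaN) values.
runs = np.ma.clump_unmasked(np.ma.masked_invalid(x))
print([(s.start, s.stop) for s in runs])   # [(2, 5), (7, 12), (13, 20)]

# First and last element of each run, joined into one flat array.
pairs = [x[s].take([0, -1]) for s in runs]
result = np.concatenate(pairs)
print(result.tolist())   # [8.0, 2.0, 169.0, 23.0, 111.0, 74.0]
```

Because .take([0, -1]) is applied to every run, a run that holds only one number contributes that number twice.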

Answer 3 (score: 1)

a = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

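# sentinel: the appended None is the right neighbour of the last element
# and, via a[-1], also the left neighbour of a[0]
a.append(None)
[a[e] for e in range(len(a)-1) if a[e] is not None and (a[e-1] is None or a[e+1] is None)]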

Output:

[8, 2, 169, 23, 111, 74]

Answer 4 (score: 1)

A vectorized way using numpy:

arr = np.asarray(arr)
# find all the nones - need the np.array to work around backwards-compatible misbehavior
is_none = arr == np.array(None)

# find which values are on an edge. Start and end get a free pass for being not none
is_edge = ~is_none
is_edge[1:-1] &= (is_none[2:] | is_none[:-2])

# use the boolean mask as an index
res = arr[is_edge]

is_edge can be computed more verbosely, but perhaps more clearly, as follows:

is_edge = ~is_none & (
    np.pad(is_none[1:], (0,1), mode='constant', constant_values=True)
    |
    np.pad(is_none[:-1], (1,0), mode='constant', constant_values=True)
)
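
As a quick check, the sketch below runs both ways of computing is_edge on a small made-up array that starts and ends with numbers, to show the "free pass" the first and last elements get. The sample data and the names is_edge_padded and res are assumptions for this illustration.

```python
import numpy as np

# Made-up sample that starts and ends with numbers.
arr = np.asarray([1, 3, None, None, 8, 7, 2, None, 5], dtype=object)
is_none = arr == np.array(None)

# Variant 1: in-place update of the interior elements.
is_edge = ~is_none
is_edge[1:-1] &= (is_none[2:] | is_none[:-2])

# Variant 2: pad with True so that the positions beyond both ends count as None.
is_edge_padded = ~is_none & (
    np.pad(is_none[1:], (0, 1), mode='constant', constant_values=True)
    |
    np.pad(is_none[:-1], (1, 0), mode='constant', constant_values=True)
)

res = arr[is_edge]
print(res.tolist())                             # [1, 3, 8, 2, 5]
print(bool((is_edge == is_edge_padded).all()))  # True
```

Here the leading run 1, 3 and the trailing 5 are kept, and a run of a single number is reported once rather than twice.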

Answer 5 (score: 0)

Here is how I would do it with a regular Python list. I don't have an answer specific to numpy arrays.

result = []
for ind, n in enumerate(lst):
    if n is not None and (lst[ind-1] is None or lst[ind+1] is None):
        result.append(n)

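Note: this gives wrong results if index 0 or len - 1 (the last element) is a number, because lst[ind-1] wraps around to the last element at index 0 and lst[ind+1] can raise an IndexError at the last index.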
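
One possible way around that limitation is to check the list boundaries explicitly. The following is a minimal sketch under the assumption that positions outside the list should count as None; the function name first_last_of_runs is made up for this example.

```python
def first_last_of_runs(lst):
    """Return the numbers that border a None or an end of the list."""
    result = []
    last_index = len(lst) - 1
    for ind, n in enumerate(lst):
        if n is None:
            continue
        # Assumption: positions outside the list are treated like None.
        prev_is_none = ind == 0 or lst[ind - 1] is None
        next_is_none = ind == last_index or lst[ind + 1] is None
        if prev_is_none or next_is_none:
            result.append(n)
    return result


print(first_last_of_runs([1, 3, None, 8, 7, 2, None, 5]))  # [1, 3, 8, 2, 5]
```

With this variant a run of a single number, such as the trailing 5, appears once in the result.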

Answer 6 (score: 0)

You can use a list comprehension:

ex = [None, None, None, None, None, 8, 7, 2, 
      None, None , None , None, None, None, None, 169, 37, 9 ,7 ,23,
      None , None , 111, 24, 87 , 9, 12 , 74, 
      None]

filt = [y for x, y, z in zip(ex, ex[1:], ex[2:]) 
        if y is not None and (x is None or z is None)]

# [8, 2, 169, 23, 111, 74]

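This requires no external dependencies; however, if the list is particularly large, making the two extra copies for the zip iterator could be expensive. It may be possible to overcome this with itertools.

Note that the above may fail if the original list contains leading or trailing numbers. Strip them first, e.g.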

while ex[0] is not None:
    del ex[0]

while ex[-1] is not None:
    del ex[-1]
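
The itertools idea mentioned in this answer could look roughly like the sketch below, which swaps the slices ex[1:] and ex[2:] for itertools.islice so that no copies of the list are made. The shortened sample list is an assumption for this illustration, and the same caveat about leading or trailing numbers applies.

```python
from itertools import islice

ex = [None, None, 8, 7, 2, None, None, 169, 37, 9, 7, 23, None]

# islice walks over the same list lazily instead of building ex[1:] and ex[2:].
triples = zip(ex, islice(ex, 1, None), islice(ex, 2, None))

filt = [y for x, y, z in triples
        if y is not None and (x is None or z is None)]

print(filt)  # [8, 2, 169, 23]
```

islice still has to step past the skipped items one by one, so this saves memory rather than time.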

Answer 7 (score: 0)

The beauty of zip is that you can do this kind of "triple iteration" (looping over a list while considering three items at a time):

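result = []
# x[1:] supplies the following item, np.hstack((None, x)) the previous one
for following, item, previous in zip(x[1:], x, np.hstack((None, x))):
    # compare against None explicitly so that a value of 0 is not skipped
    if item is not None and (previous is None or following is None):
        result.append(item)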

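The other answers are reasonable too; this is an attempt at readability/understandability.

Answer 8 (score: 0)

This is not exactly a NumPy answer, since it uses a function from another external package, iteration_utilities.split, but if you operate on a list it will probably be comparatively fast: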

>>> lst = [None, None, None, None, None, 8, 7, 2, None, None , None , None, None, 
       None, None, 169, 37, 9 ,7 ,23, None , None , 111, 24, 8, 7 , 9, 12 , 74, None]

>>> from iteration_utilities import Iterable, is_None
>>> from operator import itemgetter
>>> Iterable(lst).split(is_None           # split by None
                ).filter(None             # remove empty lists
                ).map(itemgetter(0, -1)   # get the first and last element of each splited part
                ).flatten(                # flatten the result
                ).as_list()               # and convert it to a list
[8, 2, 169, 23, 111, 74]

Note that you can also do this in pure Python:

def first_last_between_None(iterable):
    last = None
    for item in iterable:
        # If we're in the None-part just proceed until we get a not None
        if last is None:
            if item is None:
                continue
            else:
                # Not None, reset last and yield the current value
                last = item
                yield item
        else:
            # If the next item is None we're at the end of the number-part
            # yield the last item.
            if item is None:
                yield last
            last = item
    if last is not None:
        yield last

>>> list(first_last_between_None(lst))
[8, 2, 169, 23, 111, 74]

If you want to discard the first and last values when the list does not start or end with None, just take the appropriate slice:

if lst[0] is not None:
    res = res[2:]
if lst[-1] is not None:
    res = res[:-2]