How to fill between the start and end of unique entries with the same value in a numpy array?

Date: 2011-10-12 19:21:47

Tags: python numpy

I have a numpy array. Consider the following example:

a = [255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3]

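In the array above, every value other than 255 counts as a unique entry. I want to fill in the positions that lie between two occurrences of the same unique entry with that entry's value.

The result should look like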

[255,1,1,1,1,255,255,255,2,2,2,2,2,2,2,3,3,3,3,3]    

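This can be done easily; I am looking for a pythonic way.

Thanks a lot

4 answers:

Answer 0 (score: 1)

I use the groupby function from the itertools module.

I also use the window function from the page linked in the original answer.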

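from __future__ import print_function
from itertools import tee, groupby

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy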

a = [255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3]

def groupby2(iterable):
    '''Like groupby, but turn the group iterator (second element of each result) into a list'''
    for i in groupby(iterable):
        yield (i[0],list(i[1]))


def window(iterable,n):
    els = tee(iterable,n)
    for i,el in enumerate(els):
        for _ in range(i):
            next(el, None)
    return izip(*els)

def compress(iterable):
    it = window(groupby2(iterable),3)
    #Creates the iterator which yield the elements in the following manner: (255, [255]), (1, [1]), (255, [255, 255])

    for ge in it:
        flag = False #Reset the flag
        print('\nWindow: {}'.format(ge))

        for value in ge[0][1]: #Yield all the values of the first element of the window
            print('A: {}'.format(value))
            yield value

        if ge[1][0]==255 and ge[0][0]==ge[2][0]: #The central element of the window has to be replaced
            flag = True #Flag for correct last window processing        

            for _ in ge[1][1]: #Replacing the central element of the window
                print('B: {}'.format(ge[0][0]))
                yield ge[0][0]

            next(it,None) #Skip 1 element of the 'it' (which will be advanced by 1 element by for-loop, giving 2 net advances).   

    #Processing the last 2 elements of the last window.
    if not flag: #The central element of the last window hasn't been processed yet, so process it now.
        for value in ge[1][1]:
            print('C: {}'.format(value))
            yield value
    for value in ge[2][1]: #The last element of the window.
        print('D: {}'.format(value))
        yield value


print('\nInput: {}'.format(a))
output = list(compress(a))
print('Program output: {}'.format(output))
print('Goal output   : {}'.format([255,1,1,1,1,255,255,255,2,2,2,2,2,2,2,3,3,3,3,3]))

The code includes debug messages. I am leaving them in because they make it easier to see how it works. Remove them if you do not need them.

The output is:

Input: [255, 1, 255, 255, 1, 255, 255, 255, 2, 2, 255, 255, 255, 2, 2, 3, 255, 255, 255, 3]

Window: ((255, [255]), (1, [1]), (255, [255, 255]))
A: 255

Window: ((1, [1]), (255, [255, 255]), (1, [1]))
A: 1
B: 1
B: 1

Window: ((1, [1]), (255, [255, 255, 255]), (2, [2, 2]))
A: 1

Window: ((255, [255, 255, 255]), (2, [2, 2]), (255, [255, 255, 255]))
A: 255
A: 255
A: 255

Window: ((2, [2, 2]), (255, [255, 255, 255]), (2, [2, 2]))
A: 2
A: 2
B: 2
B: 2
B: 2

Window: ((2, [2, 2]), (3, [3]), (255, [255, 255, 255]))
A: 2
A: 2

Window: ((3, [3]), (255, [255, 255, 255]), (3, [3]))
A: 3
B: 3
B: 3
B: 3
D: 3
Program output: [255, 1, 1, 1, 1, 255, 255, 255, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
Goal output   : [255, 1, 1, 1, 1, 255, 255, 255, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]

Update: here is a reworked version:

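from __future__ import print_function
from itertools import tee, groupby

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy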

def groupby2(iterable):
    for i in groupby(iterable):
        yield (i[0],len(tuple(i[1])))


def window(iterable,n):
    els = tee(iterable,n)
    for i,el in enumerate(els):
        for _ in range(i):
            next(el, None)
    return izip(*els)


def subs(iterable):
    it = window(groupby2(iterable),3)
    for left, middle, right in it:
        yield [left[0]]*left[1]
        if middle[0]==255 and left[0]==right[0]:
            yield [left[0]]*middle[1]
            next(it,None)
    if not(middle[0]==255 and left[0]==right[0]):
        yield [middle[0]]*middle[1]
    yield [right[0]]*right[1]


def chained(iterable):
    for L in subs(iterable):
        for el in L:
            yield el


a = [255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3]        
print('\nInput: {}'.format(a))
output = list(chained((a)))
print('Program output: {}'.format(output))
print('Goal output  : {}'.format([255,1,1,1,1,255,255,255,2,2,2,2,2,2,2,3,3,3,3,3]))
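
To make the sliding window in this answer easier to follow, here is a minimal Python 3 sketch of the two building blocks on their own: run-length grouping with groupby, then overlapping triples of those runs. The names runs and triples are introduced here for illustration only and do not appear in the answer; the input is a shortened version of the question's array.

```python
from itertools import groupby, tee


def runs(iterable):
    """Collapse consecutive equal items into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(iterable)]


def triples(seq):
    """Return an iterator over overlapping windows of three consecutive items."""
    first, second, third = tee(seq, 3)
    next(second, None)  # shift the second copy by one item
    next(third, None)   # shift the third copy by two items
    next(third, None)
    return zip(first, second, third)


a = [255, 1, 255, 255, 1, 255, 255, 255, 2, 2]
print(runs(a))
# [(255, 1), (1, 1), (255, 2), (1, 1), (255, 3), (2, 2)]
print(list(triples(runs(a)))[1])
# ((1, 1), (255, 2), (1, 1))
```

The second window printed above is the "sandwich" case: a run of 255 whose left and right neighbours hold the same value, which is the condition the answer checks before replacing the middle run.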

Answer 1 (score: 1)

I'm not sure what "pythonic" means here, but here are my two cents:

import numpy as np    

a = np.array([255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3])

# find the locations of the unique numbers
b = np.where(a != 255)[0]
# find out what the unique numbers are
u = a[b]

for i,v in zip(b, u):
    try:
        if (v == vlast): # found a sandwich
            if (i != ilast+1): # make sure it has something in between 
                a[ilast+1: i] = v
        else: # make current unique value as the beginning of next sandwich
            vlast, ilast = v, i
    except NameError:
        # initialize the first match
        vlast, ilast = v, i

print(a)

It gives the correct answer:

[255   1   1   1   1 255 255 255   2   2   2   2   2   2   2   3   3   3   3   3]
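
The same idea can also be written as a function that keeps track of the previous non-255 entry explicitly, instead of relying on NameError for the first match. This is only a sketch: the names fill_between and placeholder are assumptions introduced here, and the loop fills between each pair of consecutive equal entries rather than from the start of a run, which gives the same result on the question's data.

```python
import numpy as np


def fill_between(values, placeholder=255):
    """Fill runs of `placeholder` that sit between two equal non-placeholder entries."""
    out = np.array(values)  # np.array copies, so the input is left untouched
    last_index, last_value = None, None
    for i in np.flatnonzero(out != placeholder):
        v = out[i]
        if last_index is not None and v == last_value:
            # Same value on both sides: overwrite whatever lies in between
            # (an empty slice when the two entries are adjacent).
            out[last_index + 1:i] = v
        last_index, last_value = i, v
    return out


a = [255, 1, 255, 255, 1, 255, 255, 255, 2, 2, 255, 255, 255, 2, 2, 3, 255, 255, 255, 3]
print(fill_between(a).tolist())
# [255, 1, 1, 1, 1, 255, 255, 255, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
```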

Answer 2 (score: 0)

A short numpy-based solution:

import numpy
a = numpy.array([255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3])

b = [(i, numpy.argmax(a == i), len(a) - numpy.argmax(a[::-1] == i)) for i in numpy.unique(a[a < 255])]

for i in b:
    a[i[1]:i[2]] = i[0]

where b is a list of tuples of the form (unique value, start index, end index + 1).
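
For the question's array, the contents of that list can be inspected directly. The sketch below rebuilds it under the name spans (a name chosen here for illustration) and converts the numpy integers to plain ints so the printout is easy to read. It also shows that this approach overwrites everything between the first and the last occurrence of each value.

```python
import numpy as np

a = np.array([255, 1, 255, 255, 1, 255, 255, 255, 2, 2,
              255, 255, 255, 2, 2, 3, 255, 255, 255, 3])

# (unique value, index of first occurrence, index of last occurrence + 1)
spans = [
    (int(v), int(np.argmax(a == v)), int(len(a) - np.argmax(a[::-1] == v)))
    for v in np.unique(a[a < 255])
]
print(spans)
# [(1, 1, 5), (2, 8, 15), (3, 15, 20)]

for value, start, stop in spans:
    a[start:stop] = value
print(a.tolist())
# [255, 1, 1, 1, 1, 255, 255, 255, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
```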

Answer 3 (score: 0)

Another solution uses the window function with 2 items, together with ifilterfalse, on the enumerated list of values.

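from __future__ import print_function
from itertools import tee

try:
    from itertools import izip, ifilterfalse  # Python 2
except ImportError:
    from itertools import filterfalse as ifilterfalse  # Python 3
    izip = zip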


def window(iterable,n):
    els = tee(iterable,n)
    for i,el in enumerate(els):
        for _ in range(i):
            next(el, None)
    return izip(*els)


def replace(iterable,placeholder=255):
    it = enumerate(iterable)

    def save_last(iterable):
        for i in iterable:
            yield i
        replace.last_index = i[0] #Remember the index of the last element of the input
    it = save_last(it)

    it = ifilterfalse(lambda x: x[1]==placeholder, it)
    for i,(left,right) in enumerate(window(it,2)):
        if i==0:
            for j in range(left[0]):
                yield placeholder
        yield left[1]
        if right[0]>left[0]+1:
            if left[1]==right[1]:
                for _ in range(right[0]-left[0]-1):
                    yield left[1]
            else:
                for _ in range(right[0]-left[0]-1):
                    yield placeholder
    yield right[1]
    if right[0]<replace.last_index:
        for i in range(replace.last_index-right[0]):
            yield placeholder


a = [255,1,255,255,1,255,255,255,2,2,255,255,255,2,2,3,255,255,255,3,255,255]        
print('\nInput: {}'.format(a))
output = list(replace(a))
print('Program output: {}'.format(output))
print('Goal output  : {}'.format([255,1,1,1,1,255,255,255,2,2,2,2,2,2,2,3,3,3,3,3,255,255]))

Here I explain how it works.
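
The pairing step at the heart of this answer can be looked at in isolation. The following Python 3 sketch (filterfalse is the Python 3 name for ifilterfalse; the variable names kept and pairs are chosen here for illustration) shows what the 2-item window sees once the 255 placeholders are filtered out of the enumerated values, using a shortened version of the question's array.

```python
from itertools import filterfalse, tee

a = [255, 1, 255, 255, 1, 255, 255, 255, 2, 2]

# Keep only (index, value) pairs whose value is not the placeholder.
kept = filterfalse(lambda pair: pair[1] == 255, enumerate(a))

# Overlapping windows of two: each kept entry paired with the next one.
left, right = tee(kept, 2)
next(right, None)
pairs = list(zip(left, right))
print(pairs)
# [((1, 1), (4, 1)), ((4, 1), (8, 2)), ((8, 2), (9, 2))]
```

Each pair carries both indices, so the gap length is right[0] - left[0] - 1, and comparing left[1] with right[1] decides whether that gap is filled with the shared value or with the placeholder.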