Python: Retrieve the values after an index in each column of a table

Time: 2014-02-21 02:39:06

Tags: python numpy

I would like to get a table that keeps, in each column, only the values from the cell equal to 100 onward. Is there an efficient way to do this?

Current:

Col1 Col2 Col3 Col4
1    89   100  92
2    100  14   88
3    75   18   100
4    34   56   63

Desired:

Col1 Col2 Col3 Col4
1    nan  100  nan
2    100  14   nan
3    75   18   100
4    34   56   63

I tried:

for row in data:
    empty.append(str(np.where(element == 100 for element in row)));
for i in empty:
    #Not sure what to do next

2 answers:

Answer 0: (score: 2)

A vectorized approach to your problem:

>>> import numpy as np
>>> a = np.array([[89, 100, 92], [100, 14, 88],
...               [75, 18, 100], [34, 56, 63]])
>>> first100 = np.argmax(a == 100, axis=0)
>>> first100
array([1, 0, 2], dtype=int64)
>>> rows = np.arange(a.shape[0])  # row indices 0..n-1
>>> mask = rows[:, None] < first100
>>> mask
array([[ True, False,  True],
       [False, False,  True],
       [False, False, False],
       [False, False, False]], dtype=bool)
>>> out = a.astype(float)
>>> out[mask] = np.nan
>>> out
array([[  nan,  100.,   nan],
       [ 100.,   14.,   nan],
       [  75.,   18.,  100.],
       [  34.,   56.,   63.]])
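
One caveat with the approach above: np.argmax returns 0 for a column that contains no 100 at all, so such a column would be left untouched. Below is a minimal sketch that wraps the same idea in a function and handles that case. The function name mask_before_first, the target parameter, and the choice to blank a column with no match entirely are my own assumptions, not part of the answer.

```python
import numpy as np

def mask_before_first(a, target=100):
    """Float copy of `a` with entries above the first `target` in each column set to NaN."""
    a = np.asarray(a)
    hits = a == target
    first = hits.argmax(axis=0)   # row of the first match per column (0 if none)
    found = hits.any(axis=0)      # whether the column has a match at all
    # Assumption: a column without any match is blanked completely.
    first = np.where(found, first, a.shape[0])
    rows = np.arange(a.shape[0])[:, None]
    out = a.astype(float)
    out[rows < first] = np.nan    # broadcast (n, 1) against (m,) to get an (n, m) mask
    return out

# The last column here is a hypothetical extra column with no 100 in it.
a = np.array([[89, 100, 92, 1], [100, 14, 88, 2],
              [75, 18, 100, 3], [34, 56, 63, 4]])
result = mask_before_first(a)
```

If you would rather keep columns without a match unchanged, replace a.shape[0] with 0 in the np.where call.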

Answer 1: (score: 1)

It looks like you are using pandas, but if you are not, you may want a cumulative maximum function:

In [37]:

a
Out[37]:
array([[ 89, 100,  92],
       [100,  14,  88],
       [ 75,  18, 100],
       [ 34,  56,  63]], dtype=int64)
In [38]:

def cummax(a):
    result=[]
    for i in range(len(a)):
        if i==0:
            result.append(a[0])
        else:
            result.append(max(a[:i+1]))
    return np.array(result)
In [39]:

np.where(np.apply_along_axis(cummax, 0, a)>=100, a, np.nan)
Out[39]:
array([[  nan,  100.,   nan],
       [ 100.,   14.,   nan],
       [  75.,   18.,  100.],
       [  34.,   56.,   63.]])
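
As a side note, NumPy already ships a running maximum as a ufunc method, np.maximum.accumulate, so the hand-written cummax and np.apply_along_axis may not be needed. Here is a sketch under the same assumption as the answer, namely that no value above 100 appears before the first 100 in a column. The variable names are mine.

```python
import numpy as np

a = np.array([[89, 100, 92], [100, 14, 88],
              [75, 18, 100], [34, 56, 63]])

# Running maximum down each column, computed without a Python loop.
running_max = np.maximum.accumulate(a, axis=0)
out = np.where(running_max >= 100, a, np.nan)

# Variant that does not depend on the values staying at or below 100:
# track "have we seen a 100 yet" directly.
seen = np.logical_or.accumulate(a == 100, axis=0)
out_exact = np.where(seen, a, np.nan)
```

For this input both variants should give the same array as Out[39] above.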