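I want to get a table that, for each column, keeps only the values from the first cell equal to 100 onward (everything above that cell becomes NaN). Is there an efficient way to do this?
Now: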
Col1 Col2 Col3 Col4
1 89 100 92
2 100 14 88
3 75 18 100
4 34 56 63
Wanted:
Col1 Col2 Col3 Col4
1 nan 100 nan
2 100 14 nan
3 75 18 100
4 34 56 63
I tried:
for row in data:
    empty.append(str(np.where(element == 100 for element in row)));
for i in empty:
    #Not sure what to do next
Answer 0 (score: 2)
A vectorized approach to your problem:
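>>> import numpy as np
>>> a = np.array([[89, 100, 92], [100, 14, 88],
...               [75, 18, 100], [34, 56, 63]])
>>> first100 = np.argmax(a == 100, axis=0)  # row of the first 100 in each column
>>> first100
array([1, 0, 2], dtype=int64)
>>> rows = np.arange(a.shape[0])  # row indices 0..3
>>> mask = rows[:, None] < first100  # True for cells above the first 100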
>>> mask
array([[ True, False,  True],
       [False, False,  True],
       [False, False, False],
       [False, False, False]], dtype=bool)
>>> out = a.astype(float)
>>> out[mask] = np.nan
>>> out
array([[  nan,  100.,   nan],
       [ 100.,   14.,   nan],
       [  75.,   18.,  100.],
       [  34.,   56.,   63.]])
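One edge case worth noting: np.argmax returns 0 for a column that contains no 100 at all, so such a column would be left completely unmasked. Below is a minimal sketch that wraps the same idea in a function and handles that case. The function name mask_before_first, the target parameter, and the choice to blank out a column with no match are all assumptions for illustration, not part of the answer above.

```python
import numpy as np

def mask_before_first(a, target=100):
    """Float copy of `a` with cells above the first `target` in each column set to NaN."""
    a = np.asarray(a)
    hits = a == target
    first = hits.argmax(axis=0)       # row of the first hit per column (0 if no hit)
    has_hit = hits.any(axis=0)        # which columns contain the target at all
    # Assumption: a column without any hit is blanked entirely.
    first = np.where(has_hit, first, a.shape[0])
    rows = np.arange(a.shape[0])
    out = a.astype(float)             # NaN needs a float array
    out[rows[:, None] < first] = np.nan
    return out

a = np.array([[89, 100, 92], [100, 14, 88],
              [75, 18, 100], [34, 56, 63]])
print(mask_before_first(a))
```

If you would rather keep a column with no 100 untouched, drop the np.where line and use first as returned by argmax.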
Answer 1 (score: 1)
It looks like you are using pandas, but if you are not, you may want a cumulative max function:
In [37]:
a
Out[37]:
array([[ 89, 100,  92],
       [100,  14,  88],
       [ 75,  18, 100],
       [ 34,  56,  63]], dtype=int64)
In [38]:
def cummax(a):
    result = []
    for i in range(len(a)):
        if i == 0:
            result.append(a[0])
        else:
            result.append(max(a[:i+1]))
    return np.array(result)
In [39]:
np.where(np.apply_along_axis(cummax, 0, a)>=100, a, np.nan)
Out[39]:
array([[  nan,  100.,   nan],
       [ 100.,   14.,   nan],
       [  75.,   18.,  100.],
       [  34.,   56.,   63.]])
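As a side note, NumPy already has a running maximum built in, np.maximum.accumulate, so the hand-written loop is probably not needed. The sketch below shows that, plus a variant that accumulates the boolean "is 100" mask with np.logical_or.accumulate. The variant is my own suggestion rather than part of the answer above; it should also hold up if a value greater than 100 appears before the first 100. The variable names are just for illustration.

```python
import numpy as np

a = np.array([[89, 100, 92],
              [100, 14, 88],
              [75, 18, 100],
              [34, 56, 63]])

# Built-in running maximum down each column, replacing the cummax loop.
running_max = np.maximum.accumulate(a, axis=0)
out = np.where(running_max >= 100, a, np.nan)

# Variant: track "has a 100 been seen yet?" per column instead of the max,
# so the result depends only on cells that are exactly 100.
seen = np.logical_or.accumulate(a == 100, axis=0)
out_mask = np.where(seen, a, np.nan)

print(out)
print(out_mask)
```

For this input both give the same table as above.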