I have a pandas Series that looks like this:
>>> print(x)
0 1
1 2
2 3
3 4
4 0
5 0
6 0
7 0
8 9
9 6
10 3
11 5
12 7
Name: c, dtype: int64
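I want to find the minimum of each group of consecutive non-zero numbers. I may not have explained that very well, so here is the output I am hoping for: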
>>> print(result)
0 1
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 3
9 3
10 3
11 3
12 3
Name: c, dtype: int64
Answer 0 (score: 3)
Use the shift-and-cumsum trick to label each consecutive run of zero or non-zero values, then call GroupBy.transform:
u = x.eq(0)
x.groupby(u.ne(u.shift()).cumsum()).transform('min')
0 1
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 3
9 3
10 3
11 3
12 3
Name: c, dtype: int64
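To see why the one-liner above works, it can help to unpack it into its intermediate steps. The following is only a sketch: the Series is rebuilt from the printed output in the question, and the names `is_zero`, `run_start`, and `run_id` are illustrative and do not appear in the original answer.

```python
import pandas as pd

# Rebuild the Series from the question (assumed from its printed output).
x = pd.Series([1, 2, 3, 4, 0, 0, 0, 0, 9, 6, 3, 5, 7], name="c")

# True where the value is zero, False otherwise.
is_zero = x.eq(0)

# True wherever the zero/non-zero state differs from the previous row.
# The first row is compared with a missing value, so it always starts a run.
run_start = is_zero.ne(is_zero.shift())

# The cumulative sum turns each run start into a new integer label.
run_id = run_start.cumsum()

# Broadcast the minimum of each run back onto every row of that run.
result = x.groupby(run_id).transform("min")
print(result.tolist())
```

Because `transform` returns a result aligned with the original index, no extra reindexing step is needed.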
Answer 1 (score: 3)
for loop and Numba
A plain for loop also does the job, and Numba can speed it up. It is not very pretty.
import pandas as pd
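import numpy as np
from numba import njit

@njit
def f(x):
    y = []  # running minimum of each run
    z = []  # for each element, the index of its run in y
    for a in x:
        if not y:
            y.append(a)
            z.append(0)
        else:
            # a new run starts when the zero/non-zero state flips
            if (y[-1] == 0) ^ (a == 0):
                y.append(a)
                z.append(z[-1] + 1)
            else:
                y[-1] = min(y[-1], a)
                z.append(z[-1])
    # map each element back to the minimum of its run
    return np.array(y)[np.array(z)]

pd.Series(f(x.to_numpy()), x.index)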
0 1
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 3
9 3
10 3
11 3
12 3
dtype: int64
itertools.groupby
Credit to room 6 for the assist.
from itertools import groupby, repeat

def repeat_min(x):
    for _, group in groupby(x, key=bool):
        group = list(group)
        minval = min(group)
        yield from repeat(minval, len(group))

pd.Series([*repeat_min(x)], x.index)
0 1
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 3
9 3
10 3
11 3
12 3
dtype: int64
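To make the grouping step in the itertools version more concrete, here is a small sketch on a plain list. The list `values` is assumed from the printed Series in the question, and the name `runs` is illustrative only.

```python
from itertools import groupby

# Values taken from the Series printed in the question.
values = [1, 2, 3, 4, 0, 0, 0, 0, 9, 6, 3, 5, 7]

# groupby only merges adjacent items with equal keys, so key=bool
# yields one group per consecutive run of zero or non-zero values.
runs = [(key, list(group)) for key, group in groupby(values, key=bool)]

for key, group in runs:
    print(key, group, "min =", min(group))
```

Each group has to be materialized with `list` before use, since the iterator that `groupby` yields is consumed as soon as the next group is requested.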