I am trying to replace the NaN values in my column.
group_choices = ['Group1', 'Group2', 'Group3']
Groups limit
1 NaN NaN
2 Group1 2
3 Group2 2
4 Group3 2
5 NaN NaN
6 NaN NaN
7 NaN NaN
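How can I randomly replace the NaN values with entries from group_choices?
Because of the values in the limit column, I am also trying to cap how often each entry of group_choices can be randomly picked.
I am trying to get the following result: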
Groups limit
1 Group3 NaN
2 Group1 2
3 Group2 2
4 Group3 2
5 Group1 NaN
6 Group2 NaN
7 Out of groups
Answer 0 (score: 2)
fillna with a dictionary
dct = dict(zip(df.Groups.loc[pd.isna].index, group_choices))
df.fillna({'Groups': dct}).fillna({'Groups': 'Out of groups'})
Groups limit
1 Group1 NaN
2 Group1 2.0
3 Group2 2.0
4 Group3 2.0
5 Group2 NaN
6 Group3 NaN
7 Out of groups NaN
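This works, but I prefer the newer version. It shows how my thinking evolved.
def get_some(i, n):
    # Yield every choice in order, repeating the whole list n times.
    for x in [*i] * n:
        yield x

def fill(s, i, n):
    gs = get_some(i, n)
    for x in s:
        if pd.isnull(x):
            # Take the next available choice; fall back once the pool is exhausted.
            try:
                yield next(gs)
            except StopIteration:
                yield "Out of groups"
        else:
            # Keep existing values untouched.
            yield x
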
df.assign(Groups=[*fill(df.Groups, group_choices, 1)])
Groups limit
1 Group1 NaN
2 Group1 2.0
3 Group2 2.0
4 Group3 2.0
5 Group2 NaN
6 Group3 NaN
7 Out of groups NaN
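The `n` argument of `get_some` controls how many times the whole list of choices can be handed out before the fallback label appears. The sketch below rebuilds the frame from the question (the construction, including the 1 to 7 index, is an assumption, since the original post does not show it) and compares `n=1` with `n=2`.

```python
import numpy as np
import pandas as pd

group_choices = ['Group1', 'Group2', 'Group3']

# Assumed reconstruction of the question's frame (index 1..7).
df = pd.DataFrame(
    {
        'Groups': [np.nan, 'Group1', 'Group2', 'Group3', np.nan, np.nan, np.nan],
        'limit': [np.nan, 2, 2, 2, np.nan, np.nan, np.nan],
    },
    index=range(1, 8),
)

def get_some(i, n):
    # Yield every choice in order, repeating the whole list n times.
    for x in [*i] * n:
        yield x

def fill(s, i, n):
    gs = get_some(i, n)
    for x in s:
        if pd.isnull(x):
            try:
                yield next(gs)
            except StopIteration:
                yield "Out of groups"
        else:
            yield x

# n=1: three choices for four gaps, so the last gap gets the fallback label.
once = df.assign(Groups=[*fill(df.Groups, group_choices, 1)])

# n=2: six choices for four gaps, so no gap falls back.
twice = df.assign(Groups=[*fill(df.Groups, group_choices, 2)])

print(once.Groups.tolist())
print(twice.Groups.tolist())
```

Note that `n` only sizes the pool of replacements; it does not count groups that are already present in the column.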
Alternative
def get_some(i, n):
    for x in [*i] * n:
        yield x

df.assign(Groups=df.Groups.fillna(
    df.Groups.loc[pd.isna].pipe(
        lambda s: pd.Series(dict(zip(s.index, get_some(group_choices, 1))))
    )
).fillna('Out of groups'))
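The snippets in this answer hand out the choices in list order and ignore the limit column. If random order and the per-group limit both matter, one possible extension is sketched below. It is not part of the original answer: `random_fill` is a hypothetical helper, the frame construction is assumed, and it interprets limit as the total number of rows a group may occupy, counting rows that are already filled.

```python
import random

import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame (index 1..7).
df = pd.DataFrame(
    {
        'Groups': [np.nan, 'Group1', 'Group2', 'Group3', np.nan, np.nan, np.nan],
        'limit': [np.nan, 2, 2, 2, np.nan, np.nan, np.nan],
    },
    index=range(1, 8),
)

def random_fill(groups, limits, seed=None):
    """Hypothetical helper: fill NaNs at random without exceeding each limit."""
    rng = random.Random(seed)
    counts = groups.value_counts()  # existing occurrences, NaN excluded
    pool = []
    for group, limit in limits.items():
        # Assumption: limit caps the total occurrences, existing rows included.
        remaining = int(limit) - int(counts.get(group, 0))
        pool.extend([group] * max(remaining, 0))
    rng.shuffle(pool)
    gs = iter(pool)
    # Once the shuffled pool is exhausted, fall back to the label.
    return [next(gs, 'Out of groups') if pd.isnull(x) else x for x in groups]

# Limit per group, read from the rows where both columns are present.
limits = df.dropna().set_index('Groups')['limit']

filled = df.assign(Groups=random_fill(df.Groups, limits, seed=0))
print(filled)
```

With the sample data each group has one slot left, so three of the four gaps receive a group in random order and the last gap becomes "Out of groups".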