I have a DataFrame, and I want to add a new column whose values depend on the index.
Here is my sample df:
import numpy as np
import pandas as pd
my = pd.DataFrame({'fruit': [
'Apple', 'Kiwi', 'Clementine', 'Kiwi', 'Banana', 'Clementine', 'Apple', 'Kiwi'],
'bites': [1, 2, 3, 1, 2, 3, 1, 2]})
I found a similar question and tried the solution given there, but I got an error message. This is what I tried:
conds = [(my.index >= 0) & (my.index <= row_2),
(my.index > row_2) & (my.index<=row_5),
(my.index > row_5) & (my.index<=row_6),
(my.index > row_6)]
names = ['Donna', 'Kelly', 'Andrea','Brenda']
my['names'] = np.select(conds, names)
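Answer 0 (score: 2)
For me it works fine (with the variables changed to numbers). I also added two alternative solutions: one using cut with the include_lowest=True parameter so that the value 0 is matched, and one that selects rows with DataFrame.loc: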
conds = [(my.index >= 0) & (my.index <= 2),
(my.index > 2) & (my.index<=5),
(my.index > 5) & (my.index<=6),
(my.index > 6)]
names = ['Donna', 'Kelly', 'Andrea','Brenda']
my['names'] = np.select(conds, names)
my['names1'] = pd.cut(my.index, [0,2,5,6,np.inf], labels=names, include_lowest=True)
my.loc[:2, 'names2'] = 'Donna'
my.loc[3:5, 'names2'] = 'Kelly'
my.loc[6, 'names2'] = 'Andrea'
my.loc[7:, 'names2'] = 'Brenda'
print(my)
        fruit  bites   names  names1  names2
0       Apple      1   Donna   Donna   Donna
1        Kiwi      2   Donna   Donna   Donna
2  Clementine      3   Donna   Donna   Donna
3        Kiwi      1   Kelly   Kelly   Kelly
4      Banana      2   Kelly   Kelly   Kelly
5  Clementine      3   Kelly   Kelly   Kelly
6       Apple      1  Andrea  Andrea  Andrea
7        Kiwi      2  Brenda  Brenda  Brenda
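One thing worth checking if the np.select version still fails for you: np.select fills any row that matches none of the conditions with its default argument, which is the integer 0 unless you override it. Mixing that integer default with string choices may produce a dtype error on some NumPy versions, so passing a string default is a safe habit. Below is a minimal, self-contained sketch of that variant. The 'Unknown' label is only an assumption for illustration; it is not part of the original question.

```python
import numpy as np
import pandas as pd

my = pd.DataFrame({
    'fruit': ['Apple', 'Kiwi', 'Clementine', 'Kiwi',
              'Banana', 'Clementine', 'Apple', 'Kiwi'],
    'bites': [1, 2, 3, 1, 2, 3, 1, 2],
})

# Same index ranges as in the answer above.
conds = [(my.index >= 0) & (my.index <= 2),
         (my.index > 2) & (my.index <= 5),
         (my.index > 5) & (my.index <= 6),
         (my.index > 6)]
names = ['Donna', 'Kelly', 'Andrea', 'Brenda']

# 'Unknown' is a hypothetical fallback label for rows matching no condition;
# a string default keeps the result dtype consistent with the string choices.
my['names'] = np.select(conds, names, default='Unknown')
print(my)
```

With the conditions above every row matches exactly one range, so the fallback label never actually appears in the result.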
Answer 1 (score: 2)
You can try pd.cut:
df['names'] = (pd.cut(df.index,
[0, 2, 5, 6, np.inf],
labels=names)
.fillna(names[0])
)
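The reason fillna is needed here is that pd.cut builds right-closed bins by default, so the first bin is (0, 2] and index 0 falls outside it. The sketch below compares the plain call with include_lowest=True on a small stand-in frame. The variable names without_lowest and with_lowest are just for illustration, and wrapping the result in pd.Series is my own choice to make the comparison easy to inspect.

```python
import numpy as np
import pandas as pd

# Small stand-in frame with the same default 0..7 index as in the question.
df = pd.DataFrame({'bites': [1, 2, 3, 1, 2, 3, 1, 2]})
names = ['Donna', 'Kelly', 'Andrea', 'Brenda']
bins = [0, 2, 5, 6, np.inf]

# Default bins are right-closed: (0, 2], (2, 5], (5, 6], (6, inf].
# Index 0 is in none of them, so it becomes NaN.
without_lowest = pd.Series(pd.cut(df.index, bins, labels=names), index=df.index)

# include_lowest=True makes the first bin include its left edge, so 0 is labelled.
with_lowest = pd.Series(
    pd.cut(df.index, bins, labels=names, include_lowest=True),
    index=df.index,
)

print(without_lowest.isna().tolist())
print(with_lowest.tolist())
```

Either approach gives the same labels for this data. fillna(names[0]) patches the single missing value afterwards, while include_lowest=True avoids creating it in the first place.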