Add column values row by row based on a condition

Time: 2019-08-24 11:59:30

Tags: python python-3.x pandas dataframe

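I have a pandas DataFrame with some scores. Now I want to check, for each Name, whether the Score has improved.

If the Score of a Name did improve, I would like to write 1, otherwise 0. If there is no previous Score available for a Name, I would like to write NaN.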

So my data frame looks like this:

    import pandas as pd
    import numpy as np
    first = {
        'Date':['2013-02-28','2013-03-29','2013-05-29','2013-06-29','2013-02-27','2013-04-30','2013-01-20'],
        'Name':['Felix','Felix','Felix','Felix','Peter','Peter','Paul'],
        'Score':['10','12','13','11','14','14','9']}

    df1 = pd.DataFrame(first)

The result should look like this:

    second = {
        'Date':['2013-02-28','2013-03-29','2013-05-29','2013-02-27','2013-04-30','2013-01-20'],
        'Name':['Felix','Felix','Felix','Peter','Peter','Paul'],
        'Score':['10','12','11','14','14','9'],
        'Improvement':['NaN','1','0','NaN','0','NaN']}

    result = pd.DataFrame(second)

I thought about doing something like:

    df1['Improvement'] = np.nan
    col_idx = df1.columns.get_loc('Improvement')
    grouped = df1[df1['ID'].isin(['Felix', 'Peter','Paul'])].groupby(['ID'])
    for name, group in grouped:
        first = True
        for index, row in group.iterrows():
            ...

But the column actually contains more than 100 names.

1 answer:

Answer 0: (score: 1)

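This can probably be simplified, but you can break it into two steps: a groupby that builds a helper column holding the previous score for each name (NaN on a name's first row), followed by np.where for the required logic. Here df is the question's data frame (df1).

import numpy as np

# Previous score within each Name; NaN on each name's first row
df['v'] = df.groupby(['Name'])['Score'].shift()
# 1 where the score went up, otherwise 0 (note: this overwrites Score)
df['Score'] = np.where(df['Score'] > df['v'], 1, 0)
# Put NaN back where there is no previous score
df['Score'] = np.where(df['v'].isna(), np.nan, df['Score'])

print(df.iloc[:, :-1])  # show everything except the helper column v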

         Date   Name  Score   
0  2013-02-28  Felix    NaN  
1  2013-03-29  Felix    1.0   
2  2013-05-29  Felix    1.0   
3  2013-06-29  Felix    0.0   
4  2013-02-27  Peter    NaN  
5  2013-04-30  Peter    0.0   
6  2013-01-20   Paul    NaN
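
The answer writes the flag back into Score and compares the scores as they are stored, which in the question's sample is as strings. A minimal sketch of the same shift-and-compare idea that keeps Score intact and fills a separate Improvement column follows. It rests on two assumptions that the original post does not state: that scores should be compared as numbers, and that "previous" means the earlier Date within each Name. The names out and prev are illustrative only.

```python
import numpy as np
import pandas as pd

# Same sample data as the question (scores are stored as strings there).
df1 = pd.DataFrame({
    'Date': ['2013-02-28', '2013-03-29', '2013-05-29', '2013-06-29',
             '2013-02-27', '2013-04-30', '2013-01-20'],
    'Name': ['Felix', 'Felix', 'Felix', 'Felix', 'Peter', 'Peter', 'Paul'],
    'Score': ['10', '12', '13', '11', '14', '14', '9'],
})

out = df1.copy()
# Assumption: scores should compare as numbers, not as strings.
out['Score'] = pd.to_numeric(out['Score'])
# Assumption: "previous" means the earlier Date within a Name.
# ISO-formatted date strings sort chronologically, so no parsing is needed.
out = out.sort_values(['Name', 'Date'])

# Previous score per Name; NaN on each name's first row.
prev = out.groupby('Name')['Score'].shift()
# 1.0 if the score went up, 0.0 if not, NaN when there is no previous score.
out['Improvement'] = (out['Score'] > prev).astype(float).where(prev.notna())

out = out.sort_index()  # restore the original row order
print(out)
```

On the sample data this gives the same flags as the output above (NaN, 1, 1, 0 for Felix; NaN, 0 for Peter; NaN for Paul), while Score keeps its numeric values.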