Pandas group-by conditional filtering

Time: 2015-05-26 06:40:21

Tags: python pandas group-by

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'First': ['Sam', 'Greg', 'Steve', 'Sam',
                             'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
                   'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens',
                            'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
                   'Address': ['112 Fake St',
                               '13 Crest St',
                               '14 Main St',
                               '112 Fake St',
                               '2 Morningwood',
                               '7 Cotton Dr',
                               '14 Main St',
                               '20 Main St',
                               '7 Cotton Dr',
                               '7 Cotton Dr'],
                   'Status': ['Infected', '', 'Infected', '', '', '', '','', '', 'Infected'],
                   })

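I apply the following group-by code, which finds every (Address, Last) group containing at least one 'Infected' row and marks all rows of that group as 'Infected':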

df_index = df.groupby(['Address', 'Last']).filter(lambda x: (x['Status'] == 'Infected').any()).index
df.loc[df_index, 'Status'] = 'Infected'

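Instead of marking everything as "Infected", as the group-by code above does, is there a way to select only the values that are being updated, so that they can be marked as something else? For example: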

df2 = df.copy(deep=True)
df2['Status'] = ['Infected', '', 'Infected', 'Infected2', '', 'Infected2', '', '', 'Infected2', 'Infected']

1 answer:

Answer 0: (score: 0)

I think this gets the result you want, although it goes about it differently:

def infect_new_people(group):
    # Work on a copy: pandas does not support mutating the group passed to apply
    status = group['Status'].copy()
    if (status == 'Infected').any():
        # Only affect people not already infected
        status[status != 'Infected'] = 'Infected2'
    return status

# Need group_keys=False so that each group has the same index
#   as the original dataframe
df['Status'] = df.groupby(['Address', 'Last'], group_keys=False).apply(infect_new_people)

df
Out[36]: 
         Address    First        Last     Status
0    112 Fake St      Sam     Stevens   Infected
1    13 Crest St     Greg  Hamcunning           
2     14 Main St    Steve     Strange   Infected
3    112 Fake St      Sam     Stevens  Infected2
4  2 Morningwood     Jill      Vargas           
5    7 Cotton Dr     Bill       Simon  Infected2
6     14 Main St      Nod      Purple           
7     20 Main St  Mallory       Green           
8    7 Cotton Dr     Ping       Simon  Infected2
9    7 Cotton Dr    Lamar       Simon   Infected
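
For comparison, here is a minimal sketch of a vectorized alternative that avoids `apply` altogether. It assumes the same `df` as in the question; the names `is_infected`, `group_has_infected` and `newly_infected` are made up for illustration. The idea is to use `transform('any')` to broadcast "this group contains an infected row" back onto every row, then update only the rows that are in such a group but are not themselves already infected.

```python
import pandas as pd

df = pd.DataFrame({'First': ['Sam', 'Greg', 'Steve', 'Sam',
                             'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
                   'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens',
                            'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
                   'Address': ['112 Fake St', '13 Crest St', '14 Main St',
                               '112 Fake St', '2 Morningwood', '7 Cotton Dr',
                               '14 Main St', '20 Main St', '7 Cotton Dr',
                               '7 Cotton Dr'],
                   'Status': ['Infected', '', 'Infected', '', '', '', '', '', '',
                              'Infected'],
                   })

# Row-level flag: is this row already marked 'Infected'?
is_infected = df['Status'].eq('Infected')

# Group-level flag broadcast back to every row: does this row's
# (Address, Last) group contain at least one infected row?
group_has_infected = (
    is_infected.groupby([df['Address'], df['Last']]).transform('any')
)

# Rows in an affected group that were not infected to begin with
newly_infected = group_has_infected & ~is_infected
df.loc[newly_infected, 'Status'] = 'Infected2'

print(df)
```

Because the mask `newly_infected` is kept as a separate variable, it can also be reused to label those rows with any other value, or to select them for inspection, without touching the rows that were infected originally.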