Question

The first row of each ID group has an end_date. I am trying to identify every row in the group whose begin_date equals the end_date of that first row, and I need to return the type of the matching row. If there are multiple matches, the first one is enough. If there is no match, return 'non-existent'.

DF

ID    color    begin_date    end_date      type
1     red      2017-01-01    2017-01-07    Professional
1     green    2017-01-05    2017-01-07    Aquatic
1     blue     2017-01-07    2017-01-15    Superhero
1     red      2017-01-11    2017-01-22    Chocolate
2     red      2017-02-22    2017-02-26    Professional
2     green    2017-02-26    2017-02-28    Aquatic
2     blue     2017-02-26    2017-02-28    Superhero
2     red      2017-02-27    2017-02-28    Chocolate
3     red      2017-03-11    2017-03-22    Chocolate

Final df

ID    type
1     Superhero
2     Aquatic
3     non-existent

What I tried:

if df.groupby('ID')['begin_date'].first() == df.groupby('ID')['end_date'].any():
    return df.groupby('ID')['end_date'].any().to_dict()
else:
    return 'non-existent'


Answer 1

IIUC

df.groupby('ID').apply(
    lambda x: df.loc[x['begin_date'].isin(x['end_date'].iloc[[0]]).idxmax(), 'type']
    if x['begin_date'].isin(x['end_date'].iloc[[0]]).any()
    else 'non-existent'
)
Out[23]: 
ID
1       Superhero
2         Aquatic
3    non-existent
dtype: object
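
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same idea. It is not the answerer's exact code: the helper name first_matching_type is made up for illustration, the data is typed in from the question, and the dates are left as ISO-formatted strings, so the comparison is plain string equality.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2, 2, 3],
    "color": ["red", "green", "blue", "red",
              "red", "green", "blue", "red", "red"],
    "begin_date": ["2017-01-01", "2017-01-05", "2017-01-07", "2017-01-11",
                   "2017-02-22", "2017-02-26", "2017-02-26", "2017-02-27",
                   "2017-03-11"],
    "end_date": ["2017-01-07", "2017-01-07", "2017-01-15", "2017-01-22",
                 "2017-02-26", "2017-02-28", "2017-02-28", "2017-02-28",
                 "2017-03-22"],
    "type": ["Professional", "Aquatic", "Superhero", "Chocolate",
             "Professional", "Aquatic", "Superhero", "Chocolate",
             "Chocolate"],
})


def first_matching_type(group, default="non-existent"):
    """Hypothetical helper: return the type of the first row whose
    begin_date equals the end_date of the group's first row, or
    `default` when no row in the group matches."""
    mask = group["begin_date"].eq(group["end_date"].iloc[0])
    return group.loc[mask, "type"].iloc[0] if mask.any() else default


# Iterate over the groups explicitly and collect one value per ID.
result = pd.Series(
    {group_id: first_matching_type(group) for group_id, group in df.groupby("ID")},
    name="type",
)
print(result)
```

Computing the mask once per group avoids evaluating the same comparison twice, which the one-liner above does.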

Answer 2

Here is another approach, using groupby(), nth(), and reindex():

(df.groupby('ID')
   .apply(lambda x: x.loc[x.begin_date.eq(x.end_date.iloc[0]), 'type'])
   .groupby('ID')
   .nth(0)
   .reindex(df['ID'].unique(), fill_value='non-existent'))

ID
1       Superhero
2         Aquatic
3    non-existent
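
A related variant, sketched here as an alternative rather than as this answer's method, avoids apply() altogether: broadcast each group's first end_date with transform('first'), filter the matching rows, and keep the first match per ID. It assumes no end_date is missing, because 'first' skips nulls. As far as I know, pandas 2.0 changed nth() to behave as a filter that keeps the original index, so this version may also be less sensitive to the pandas version in use.

```python
import pandas as pd

# Only the columns the comparison needs, typed in from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2, 2, 3],
    "begin_date": ["2017-01-01", "2017-01-05", "2017-01-07", "2017-01-11",
                   "2017-02-22", "2017-02-26", "2017-02-26", "2017-02-27",
                   "2017-03-11"],
    "end_date": ["2017-01-07", "2017-01-07", "2017-01-15", "2017-01-22",
                 "2017-02-26", "2017-02-28", "2017-02-28", "2017-02-28",
                 "2017-03-22"],
    "type": ["Professional", "Aquatic", "Superhero", "Chocolate",
             "Professional", "Aquatic", "Superhero", "Chocolate",
             "Chocolate"],
})

# Broadcast the first end_date of each ID to every row of that ID.
first_end = df.groupby("ID")["end_date"].transform("first")

# Keep rows whose begin_date matches, then only the first match per ID.
matches = df.loc[df["begin_date"].eq(first_end)].drop_duplicates("ID")

# IDs without any match are filled in with the placeholder value.
result = (
    matches.set_index("ID")["type"]
    .reindex(df["ID"].unique(), fill_value="non-existent")
)
print(result)
```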

How do I compare items within a group with pandas and return a boolean?

2 answers: