import pandas as pd

df = pd.DataFrame({
    "A": ["2002-01-12", "2002-01-12", "2002-01-12", "2002-01-13", "2002-01-13", "2002-01-13", "2002-01-16", "2002-01-16", "2002-01-16"],
    "B": ["12:00:00", "13:00:00", "14:00:00", "11:00:00", "12:00:00", "13:00:00", "10:00:00", "11:00:00", "12:00:00"],
    "C": [3, 19, 15, 6, 1, 5, 3, 12, 8],
})
print(df)
            A         B   C
0  2002-01-12  12:00:00   3
1  2002-01-12  13:00:00  19
2  2002-01-12  14:00:00  15
3  2002-01-13  11:00:00   6
4  2002-01-13  12:00:00   1
5  2002-01-13  13:00:00   5
6  2002-01-16  10:00:00   3
7  2002-01-16  11:00:00  12
8  2002-01-16  12:00:00   8
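I want to create two new columns, df['D'] and df['E'], for each group of A, under the following conditions:
df['D']: takes the value of C where B == 12:00:00 on the previous day (with respect to the A groups).
df['E']: takes the mean of the C values of the previous day (with respect to the A groups).
The first day has no previous day, so it gets 0. The output should be: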
            A         B   C  D     E
0  2002-01-12  12:00:00   3  0     0
1  2002-01-12  13:00:00  19  0     0
2  2002-01-12  14:00:00  15  0     0
3  2002-01-13  11:00:00   6  3  12.3
4  2002-01-13  12:00:00   1  3  12.3
5  2002-01-13  13:00:00   5  3  12.3
6  2002-01-16  10:00:00   3  1   4.0
7  2002-01-16  11:00:00  12  1   4.0
8  2002-01-16  12:00:00   8  1   4.0
Answer 0 (score: 3)
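You can create a helper Series with one value per day, shift it so that each day carries the previous day's value, map it onto the new columns, and finally replace the NaN values with fillna: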
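# C at 12:00:00 for each day, shifted so each day carries the previous day's value
a = df[df['B'].eq('12:00:00')].set_index('A')['C'].shift(1)
# daily mean of C, shifted the same way
b = df.groupby('A')['C'].mean().shift(1)
# broadcast the per-day values back onto every row of that day
df['D'] = df['A'].map(a)
df['E'] = df['A'].map(b)
# the first day has no predecessor, so replace its NaN with 0
df[['D','E']] = df[['D','E']].fillna(0)
print(df)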
            A         B   C    D          E
0  2002-01-12  12:00:00   3  0.0   0.000000
1  2002-01-12  13:00:00  19  0.0   0.000000
2  2002-01-12  14:00:00  15  0.0   0.000000
3  2002-01-13  11:00:00   6  3.0  12.333333
4  2002-01-13  12:00:00   1  3.0  12.333333
5  2002-01-13  13:00:00   5  3.0  12.333333
6  2002-01-16  10:00:00   3  1.0   4.000000
7  2002-01-16  11:00:00  12  1.0   4.000000
8  2002-01-16  12:00:00   8  1.0   4.000000
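The shift-and-map approach above assumes every day has exactly one 12:00:00 row. If a day had none, the shifted helper Series would skip that day and hand the wrong day's value to the next one. The following is only a sketch of one way to guard against that, by reindexing both helpers on the full list of days before shifting. The function name add_previous_day_columns and its noon parameter are made up for illustration and are not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["2002-01-12"] * 3 + ["2002-01-13"] * 3 + ["2002-01-16"] * 3,
    "B": ["12:00:00", "13:00:00", "14:00:00",
          "11:00:00", "12:00:00", "13:00:00",
          "10:00:00", "11:00:00", "12:00:00"],
    "C": [3, 19, 15, 6, 1, 5, 3, 12, 8],
})


def add_previous_day_columns(frame, noon="12:00:00"):
    """Hypothetical helper: add D and E from the previous date present in A."""
    out = frame.copy()
    # All dates in order of appearance (assumes the frame is sorted by A).
    days = pd.Index(out["A"].unique())
    # First C value at the chosen time per day; days without one become NaN.
    noon_c = out.loc[out["B"].eq(noon)].groupby("A")["C"].first().reindex(days)
    # Mean of C per day, aligned to the same list of days.
    mean_c = out.groupby("A")["C"].mean().reindex(days)
    # Shift by one day, map back onto the rows, and fill the first day with 0.
    out["D"] = out["A"].map(noon_c.shift(1)).fillna(0)
    out["E"] = out["A"].map(mean_c.shift(1)).fillna(0)
    return out


result = add_previous_day_columns(df)
print(result)
```

On the sample data this gives the same D and E columns as the answer. The two only differ when a day has no 12:00:00 row.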
Answer 1 (score: 0)
I wrote a more brute-force version, but it works:
df['A'] = pd.to_datetime(df['A'])
df['D'] = df['A'].apply(lambda x: df[(df['A']==(x + pd.DateOffset(-1)))&(df['B']=='12:00:00')]['C'].mean()).fillna(0)
df['E'] = df['A'].apply(lambda x: df[df['A']==(x + pd.DateOffset(-1))]['C'].mean()).fillna(0)
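One thing to check with this approach: pd.DateOffset(-1) steps back one calendar day, so the rows for 2002-01-16 look for 2002-01-15, which is not in the data, and end up as 0. The expected output in the question uses 2002-01-13, the previous date present in A. The sketch below compares the two interpretations for column E, using a vectorized lookup in place of the row-wise apply. The names daily_mean, e_calendar and e_previous_group are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["2002-01-12"] * 3 + ["2002-01-13"] * 3 + ["2002-01-16"] * 3,
    "B": ["12:00:00", "13:00:00", "14:00:00",
          "11:00:00", "12:00:00", "13:00:00",
          "10:00:00", "11:00:00", "12:00:00"],
    "C": [3, 19, 15, 6, 1, 5, 3, 12, 8],
})
df["A"] = pd.to_datetime(df["A"])

# Mean of C per date, indexed by date.
daily_mean = df.groupby("A")["C"].mean()

# Interpretation 1: the calendar day before each row's date.
e_calendar = (df["A"] - pd.Timedelta(days=1)).map(daily_mean).fillna(0)

# Interpretation 2: the previous date that actually occurs in A.
e_previous_group = df["A"].map(daily_mean.shift(1)).fillna(0)

print(pd.DataFrame({
    "A": df["A"],
    "E_calendar": e_calendar,
    "E_previous_group": e_previous_group,
}))
```

The two columns agree for 2002-01-12 and 2002-01-13 and differ only for 2002-01-16, where the calendar-day lookup finds nothing.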