I have a DataFrame that looks like this:
Date Daily Risk Score
0 2020-06-26 6.0
1 2020-06-27 6.0
2 2020-06-28 6.0
3 2020-06-29 6.0
4 2020-06-30 6.0
5 2020-07-01 6.0
6 2020-07-02 6.0
7 2020-07-03 6.0
8 2020-07-04 6.0
9 2020-07-05 6.0
10 2020-07-06 6.0
11 2020-07-07 6.0
12 2020-07-08 6.0
13 2020-07-09 6.0
14 2020-06-26 6.0
15 2020-06-27 6.0
16 2020-06-28 6.0
17 2020-06-29 6.0
18 2020-06-30 6.0
19 2020-07-01 6.0
20 2020-07-02 6.0
21 2020-07-03 6.0
22 2020-07-04 6.0
23 2020-07-05 6.0
24 2020-07-06 6.0
25 2020-07-07 6.0
26 2020-07-08 6.0
27 2020-07-09 6.0
28 2020-06-26 1.0
29 2020-06-27 1.0
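The actual DataFrame has about 50,000 rows. I want to take the mean of all the Daily Risk Score values for each date, and then store each of these 14 means in a new column called "means", so that every row carries the mean computed for its date.
This is what I tried: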
df2['Date'] = pd.to_datetime(df2['Date'])
dates = pd.date_range(today, (today+dt.timedelta()))
for i in dates:
    df2 = df2[df2['Date']==i]
    df2['means'] = df2['Daily Risk Score'].mean()
But this only computes the mean for the first day and then the loop stops. What am I doing wrong?
Answer 0 (score: 1)
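As for why the loop seems to stop after the first day: inside the loop, `df2` is overwritten with the rows for a single date, so every later iteration filters a frame that no longer contains any other dates. Also, if `dt.timedelta()` really is called with no arguments as shown, it is a zero-length delta and `dates` holds just one date. The sketch below reproduces the first problem on a tiny made-up sample (the data and the `rows_seen` list are my own, only there for illustration):

```python
import pandas as pd

# Hypothetical 4-row stand-in for df2 (two dates, two rows each)
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2020-06-26", "2020-06-27", "2020-06-26", "2020-06-27"]),
    "Daily Risk Score": [6.0, 6.0, 1.0, 1.0],
})
dates = pd.date_range("2020-06-26", "2020-06-27")

rows_seen = []
for day in dates:
    # Same pattern as in the question: df2 is replaced by the filtered frame,
    # so the next iteration only sees rows for the previous date
    df2 = df2[df2["Date"] == day]
    rows_seen.append(len(df2))

print(rows_seen)  # [2, 0]
```

After the first pass only the 2020-06-26 rows are left, so the second pass finds nothing. Filtering into a separate variable would avoid this, but a groupby is simpler.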
You can do the following:
mean_df = df.groupby("Date").mean().reset_index()
mean_df.columns = ["Date", "means"]
# Date means
#0 2020-06-26 4.333333
#1 2020-06-27 4.333333
#2 2020-06-28 6.000000
#3 2020-06-29 6.000000
#4 2020-06-30 6.000000
#5 2020-07-01 6.000000
#6 2020-07-02 6.000000
#7 2020-07-03 6.000000
#8 2020-07-04 6.000000
#9 2020-07-05 6.000000
#10 2020-07-06 6.000000
#11 2020-07-07 6.000000
#12 2020-07-08 6.000000
#13 2020-07-09 6.000000
result = pd.merge(df, mean_df, on="Date")
# Date Daily Risk Score means
#0 2020-06-26 6.0 4.333333
#1 2020-06-26 6.0 4.333333
#2 2020-06-26 1.0 4.333333
#3 2020-06-27 6.0 4.333333
#4 2020-06-27 6.0 4.333333
#5 2020-06-27 1.0 4.333333
#6 2020-06-28 6.0 6.000000
#7 2020-06-28 6.0 6.000000
#8 2020-06-29 6.0 6.000000
#9 2020-06-29 6.0 6.000000
#10 2020-06-30 6.0 6.000000
#11 2020-06-30 6.0 6.000000
#12 2020-07-01 6.0 6.000000
#13 2020-07-01 6.0 6.000000
#14 2020-07-02 6.0 6.000000
#15 2020-07-02 6.0 6.000000
#16 2020-07-03 6.0 6.000000
#17 2020-07-03 6.0 6.000000
#18 2020-07-04 6.0 6.000000
#19 2020-07-04 6.0 6.000000
#20 2020-07-05 6.0 6.000000
#21 2020-07-05 6.0 6.000000
#22 2020-07-06 6.0 6.000000
#23 2020-07-06 6.0 6.000000
#24 2020-07-07 6.0 6.000000
#25 2020-07-07 6.0 6.000000
#26 2020-07-08 6.0 6.000000
#27 2020-07-08 6.0 6.000000
#28 2020-07-09 6.0 6.000000
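If you only need the new column and not the separate table of means, `groupby(...).transform("mean")` may be a shorter route, since it returns one value per original row and keeps the original row order. A minimal sketch, assuming the column names from the question and a small made-up sample in place of the real 50,000 rows:

```python
import pandas as pd

# Hypothetical sample with the same column names as in the question
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-06-26", "2020-06-27", "2020-06-28",
        "2020-06-26", "2020-06-27", "2020-06-28",
        "2020-06-26", "2020-06-27",
    ]),
    "Daily Risk Score": [6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 1.0, 1.0],
})

# transform("mean") broadcasts each date's mean back onto every row of that date,
# aligned on df's index, so it can be assigned directly as a new column
df["means"] = df.groupby("Date")["Daily Risk Score"].transform("mean")

print(df)
```

Here the 2020-06-26 and 2020-06-27 rows get (6 + 6 + 1) / 3 and the 2020-06-28 rows get 6.0, the same values as in the merge approach above.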