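I came across this post:
Monthly Averages Using Daily Data Using Python Pandas
It answered most of my question. What remains is how to align the result with the original data, so that each monthly average sits on the last row of its month, like this: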
1979 1 1 3
1979 1 2 3
1979 1 3 3
1979 1 4 3
1979 1 5 3
1979 1 6 3
1979 1 7 4
1979 1 8 5
1979 1 9 7
1979 1 10 8
1979 1 11 16
1979 1 12 16
1979 1 13 16
1979 1 14 18
1979 1 15 18
1979 1 16 18
1979 1 17 18
1979 1 18 20
1979 1 19 20
1979 1 20 20
1979 1 21 20
1979 1 22 20
1979 1 23 18
1979 1 24 18
1979 1 25 18
1979 1 26 18
1979 1 27 18
1979 1 28 18
1979 1 29 18
1979 1 30 18
1979 1 31 19 13.709677
1979 2 1 19
1979 2 2 19
1979 2 3 19
1979 2 4 19
1979 2 5 19
1979 2 6 22
1979 2 7 24
1979 2 8 27
1979 2 9 29
1979 2 10 32
1979 2 11 32
1979 2 12 32
1979 2 13 32
1979 2 14 33
1979 2 15 33
1979 2 16 33
1979 2 17 34
1979 2 18 36
1979 2 19 36
1979 2 20 36
1979 2 21 36
1979 2 22 36
1979 2 23 36
1979 2 24 31
1979 2 25 29
1979 2 26 27
1979 2 27 27
1979 2 28 27 29.107143
Thanks in advance.
Answer 0 (score: 0)
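You can use GroupBy.transform to compute the monthly mean for every row, followed by df.duplicated to flag the repeated values within each month and replace them with an empty string, leaving the mean only on the last row of the month.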
In [198]: df['monthly_avg'] = df.groupby(["year", "month"]).snow_depth.transform('mean')
In [205]: df.assign(monthly_avg=df['monthly_avg'].mask(df.duplicated(['year', 'month'], keep='last'), ''))
Out[205]:
year month day snow_depth monthly_avg
0 1979 1 1 3
1 1979 1 2 3
2 1979 1 3 3
3 1979 1 4 3
4 1979 1 5 3
5 1979 1 6 3
6 1979 1 7 4
7 1979 1 8 5
8 1979 1 9 7
9 1979 1 10 8
10 1979 1 11 16
11 1979 1 12 16
12 1979 1 13 16
13 1979 1 14 18
14 1979 1 15 18
15 1979 1 16 18
16 1979 1 17 18
17 1979 1 18 20
18 1979 1 19 20
19 1979 1 20 20
20 1979 1 21 20
21 1979 1 22 20
22 1979 1 23 18
23 1979 1 24 18
24 1979 1 25 18
25 1979 1 26 18
26 1979 1 27 18
27 1979 1 28 18
28 1979 1 29 18
29 1979 1 30 18
30 1979 1 31 19 13.7097
31 1979 2 1 19
32 1979 2 2 19
33 1979 2 3 19
34 1979 2 4 19
35 1979 2 5 19
36 1979 2 6 22
37 1979 2 7 24
38 1979 2 8 27
39 1979 2 9 29
40 1979 2 10 32
41 1979 2 11 32
42 1979 2 12 32
43 1979 2 13 32
44 1979 2 14 33
45 1979 2 15 33
46 1979 2 16 33
47 1979 2 17 34
48 1979 2 18 36
49 1979 2 19 36
50 1979 2 20 36
51 1979 2 21 36
52 1979 2 22 36
53 1979 2 23 36
54 1979 2 24 31
55 1979 2 25 29
56 1979 2 26 27
57 1979 2 27 27
58 1979 2 28 27 29.1071
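The two steps above can be sketched end to end on a small frame. This is only an illustration under a few assumptions: the sample values are made up, the column names follow the answer, and NaN is used instead of an empty string for the blanked rows so the column stays numeric. Duplicates are flagged on the (year, month) keys rather than on the averaged values, which avoids blanking a month whose mean happens to equal a later month's mean.

```python
import pandas as pd

# Tiny stand-in for the snow-depth data (values made up for illustration).
df = pd.DataFrame({
    "year":  [1979, 1979, 1979, 1979, 1979],
    "month": [1, 1, 1, 2, 2],
    "day":   [1, 2, 3, 1, 2],
    "snow_depth": [3, 4, 5, 10, 20],
})

# Broadcast each month's mean back onto every row of that month.
monthly_avg = df.groupby(["year", "month"])["snow_depth"].transform("mean")

# True only on the last row of each (year, month) group.
is_last_row = ~df.duplicated(["year", "month"], keep="last")

# Keep the mean on the last row of each month, NaN everywhere else.
df["monthly_avg"] = monthly_avg.where(is_last_row)

print(df)
```

If the blank cells are only needed for display, something like df.to_string(na_rep='') should render the NaN values as empty without changing the column's dtype.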
Answer 1 (score: -1)
I took the answer from the original post and adapted it.
import pandas as pd

# Read the file into a pandas.DataFrame,
# using 'one or more whitespace characters' as the separator
df = pd.read_csv("snow.txt", sep=r'\s+', names=["year", "month", "day", "snow_depth"])
# Group by year and month, then broadcast each group's mean to its rows
df['monthly_avg'] = df.groupby(["year", "month"])['snow_depth'].transform('mean')
transform broadcasts the aggregated result back onto the rows of each groupby group, so every row of your df ends up with its month's average.
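To make the broadcasting concrete, here is a minimal sketch comparing a plain aggregation with transform on the same groups. The sample values are invented for illustration; only the column names come from the answer.

```python
import pandas as pd

# Made-up sample: two days in each of two months.
df = pd.DataFrame({
    "year":  [1979, 1979, 1979, 1979],
    "month": [1, 1, 2, 2],
    "snow_depth": [3, 5, 19, 21],
})

grouped = df.groupby(["year", "month"])["snow_depth"]

# Aggregation: one value per (year, month) group, indexed by the group keys.
per_group = grouped.mean()

# Transform: one value per original row, indexed like df itself.
per_row = grouped.transform("mean")

print(per_group)
print(per_row)
```

Because the transformed Series shares the index of df, it can be assigned straight to a new column without a merge.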