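I have some data representing results from many different sites. I want to find the quartile breakdown of my results, along with the maximum and minimum dates for each site.
Finding each of these individually is easy: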
#quartiles
q = df.groupby(['site_id', 'datum']).quantile([0.25,0.5,0.75])
#max and min values
d_max = df.groupby(['site_id', 'datum']).max()
d_min = df.groupby(['site_id', 'datum']).min()
Each result is a MultiIndex DataFrame. How can I combine them to get all three values for each combination of site_id and datum?
Some sample data:
from io import StringIO
import pandas as pd
TESTDATA=StringIO(u'''date site_id datum result
1968-01-10 RN004481 SWL 61.23
1977-06-07 RN004481 SWL 60.16
1979-12-12 RN004481 SWL 58.76
1971-04-24 RN004482 SWL 79.93
1971-09-29 RN004482 SWL 79.97
1995-09-19 RN004482 SWL 92.91
1996-02-08 RN004482 SWL 93.15
1964-10-29 RN00448411 SWL 67.87
1965-03-04 RN004687 SWL 74.90
1993-03-16 RN02528611 SWL 7.50
2011-10-24 RN029429 SWL 2.59
2011-11-05 RN029429 SWL 2.68
1992-06-24 RN004464 SWL 52.24
1986-08-11 RN004482 SWL 86.84
1998-01-29 RN004482 SWL 94.33
1966-11-24 RN004687 DTW 75.16
1978-08-30 RN004687 SWL 78.24
1983-02-22 RN004687 DTW 81.00
1984-07-24 RN004687 SWL 81.26
1993-07-07 RN004687 SWL 87.18
1994-04-08 RN004687 DTW 87.53
1994-08-11 RN004687 SWL 87.41
2001-01-10 RN004687 SWL 92.04
2010-11-15 RN004687 SWL 97.06
1964-10-01 RN004693 SWL 59.56
1965-06-03 RN004693 SWL 59.74
1967-05-19 RN004693 SWL 59.58
1967-06-23 RN004693 RSWL 59.61
1967-09-22 RN004693 RSWL 59.69
1970-12-16 RN004693 DTW 59.54
''')
# sep=r'\s+' splits on any whitespace; delim_whitespace is deprecated in recent pandas
df = pd.read_csv(TESTDATA, sep=r'\s+')
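For reference, here is a minimal sketch of the kind of combined table I am after. It is one possible approach, not necessarily the best one. It rebuilds a small subset of the sample rows inline so it runs on its own, and it assumes the dates are parsed with pd.to_datetime. The column names q25, q50, q75, date_min and date_max are hypothetical labels chosen for illustration.

```python
import pandas as pd

# Small subset of the sample rows, rebuilt inline so the sketch is self-contained.
df = pd.DataFrame({
    'date': ['1968-01-10', '1977-06-07', '1979-12-12',
             '1966-11-24', '1983-02-22', '1994-04-08'],
    'site_id': ['RN004481', 'RN004481', 'RN004481',
                'RN004687', 'RN004687', 'RN004687'],
    'datum': ['SWL', 'SWL', 'SWL', 'DTW', 'DTW', 'DTW'],
    'result': [61.23, 60.16, 58.76, 75.16, 81.00, 87.53],
})
# Assumption: dates should be compared as timestamps, not as strings.
df['date'] = pd.to_datetime(df['date'])

g = df.groupby(['site_id', 'datum'])

# Quartiles of `result`: the quantile level becomes a third index level,
# so unstack it into columns to get one row per (site_id, datum).
q = g['result'].quantile([0.25, 0.5, 0.75]).unstack()
q.columns = ['q25', 'q50', 'q75']  # hypothetical column names

# Earliest and latest date per group, using named aggregation.
dates = g['date'].agg(date_min='min', date_max='max')

# Both frames share the (site_id, datum) index, so join aligns them row by row.
summary = q.join(dates)
print(summary)
```

Unstacking the quantile level is what makes the indexes line up: after that step every frame is indexed by (site_id, datum) only, so a plain index join is enough.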