Calculating seasonality and trend of time-series data for each cluster

Time: 2018-07-31 20:30:09

Tags: python pandas group-by time-series pivot-table

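I have this time-series data, and I want to use 'modal_price' to work out the trend and the type of seasonality (multiplicative or additive) for each cluster of APMC and Commodity. The dataset has about 60,000 rows like these, in which the same APMC and Commodity repeat while the date changes. The dataset looks like this: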

            APMC     Commodity       qtl_weight  min_price  max_price  modal_price  district_name  Year  Month
date
2014-12-01  Akole    bajri           40          1375       1750       1563         Ahmadnagar     2014  12
2014-12-01  Akole    paddy-unhusked  346         1400       1800       1625         Ahmadnagar     2014  12
2014-12-01  Akole    wheat           55          1500       1900       1675         Ahmadnagar     2014  12
2014-12-01  Akole    bhagar/vari     59          2000       2600       2400         Ahmadnagar     2014  12
2014-12-01  Akole    gram            9           3200       3300       3235         Ahmadnagar     2014  12
2014-12-01  Jamkhed  cotton          44199       3950       4033       3991         Ahmadnagar     2014  12
2014-12-01  Jamkhed  bajri           846         1300       1488       1394         Ahmadnagar     2014  12
2014-12-01  Jamkhed  wheat(husked)   155         1879       2231       2055         Ahmadnagar     2014  12
2014-12-01  Kopar    gram            421         1983       2698       2463         Ahmadnagar     2014  12
2014-12-01  Kopar    greengram       18          6734       7259       6759         Ahmadnagar     2014  12
2014-12-01  Kopar    soybean         1507        2945       3247       3199         Ahmadnagar     2014  12
2016-11-01  Sanga    wheat(husked)   222         1730       2173       1994         Ahmadnagar     2016  11

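I tried building a pivot table on this (with APMC, Commodity and date as the index), but that does not help with calculating the mean for each cluster (APMC, Commodity), which I need in order to compute the trend. I just need to know how to calculate the mean of 'modal_price' for each cluster (APMC, Commodity) and store it as a COLUMN in the dataframe / pivot table.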

1 Answer:

Answer 0: (score: 0)

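Perhaps groupby will give you the information you need for the trend, and transform will then let you project it back onto the same index? Something like this: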

# group by your cluster
g = df.groupby(["Year", "APMC", "Commodity"])
# determine the trend per cluster, but broadcast it back onto the original dimensions
trend = g["modal_price"].transform("mean")
df["trend"] = trend
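Note that grouping by "Year" as well yields one mean per cluster per year; dropping "Year" from the keys gives a single mean per (APMC, Commodity) cluster. As a rough, self-contained sketch of the same groupby/transform idea, the snippet below uses synthetic monthly prices (the values, the 24-month span, and the column names `cluster_mean`, `detrended_additive` and `detrended_multiplicative` are all assumptions made for illustration, not part of the original data). It also estimates a trend with a centered 12-month rolling mean, which is one common choice for monthly data rather than the only one, and computes the detrended series under both models so they can be compared. It does not decide between additive and multiplicative by itself.

```python
import pandas as pd

# Synthetic monthly prices for two (APMC, Commodity) clusters; illustrative values only.
dates = pd.date_range("2014-01-01", periods=24, freq="MS")
frames = []
for apmc, commodity, base in [("Akole", "bajri", 1500), ("Kopar", "gram", 2400)]:
    frames.append(pd.DataFrame({
        "date": dates,
        "APMC": apmc,
        "Commodity": commodity,
        # linear drift plus a repeating 12-month sawtooth pattern
        "modal_price": [base + 10 * i + 50 * ((i % 12) - 5.5) / 5.5 for i in range(24)],
    }))

# Keep "date" as a column here so the row index stays unique,
# and sort so that each cluster is in chronological order.
df = pd.concat(frames, ignore_index=True)
df = df.sort_values(["APMC", "Commodity", "date"]).reset_index(drop=True)

cluster = df.groupby(["APMC", "Commodity"])["modal_price"]

# Overall mean per cluster, broadcast back onto every row of that cluster.
df["cluster_mean"] = cluster.transform("mean")

# Centered 12-month rolling mean per cluster as a rough trend estimate
# (the window length of 12 is an assumption for monthly data).
df["trend"] = cluster.transform(lambda s: s.rolling(window=12, center=True).mean())

# Detrended series under each model; NaN wherever the rolling window is incomplete.
df["detrended_additive"] = df["modal_price"] - df["trend"]
df["detrended_multiplicative"] = df["modal_price"] / df["trend"]

print(df.dropna(subset=["trend"]).head())
```

If the seasonal swings in `detrended_additive` grow as the price level rises while those in `detrended_multiplicative` stay roughly constant, that points towards a multiplicative model, and the reverse points towards an additive one. With real data that has missing months, the rolling window would need more care than this sketch takes.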