I have a time series dataset that looks like this:
Date Newspaper City1 City2 Region1Total City3 City4 Region2Total
2017-12-01 NewsPaper1 231563 8696 240259 21072 8998 30070
2017-12-01 NewsPaper2 173009 12180 185189 28910 5550 34460
2017-12-01 NewsPaper3 40511 4600 45111 5040 3330 8370
2017-12-01 NewsPaper4 37770 2980 40750 6520 1880 8400
2017-12-01 NewsPaper5 5176 900 6076 1790 5000 6790
2017-12-01 NewsPaper6 137650 8025 145675 25300 11000 36300
2017-12-01 Total 637547 38201 675748 91032 36558 127590
2018-01-01 NewsPaper1 231295 8391 239686 8790 21176 29966
2018-01-01 NewsPaper2 169937 12130 182067 7890 28850 36740
2018-01-01 NewsPaper3 40453 4570 45023 4750 5055 9800
2018-01-01 NewsPaper4 37766 2970 40736 2500 6540 9040
2018-01-01 NewsPaper5 5136 900 6036 5600 1795 7365
2018-01-01 NewsPaper6 137990 8010 146000 14500 25330 39830
2018-01-01 Total 633919 37786 671705 44980 91141 136121
I am trying to find the n largest values in each column of this dataframe. I tried the following:
import pandas as pd

somelist = []
data = pd.read_csv('newspaper.csv')  # the file is a CSV, so read_csv rather than read_excel
data.index = pd.to_datetime(data['Date'], errors='coerce')
# Consider only the latest month in the dataframe
last_month = data.loc[data.index[-1]].set_index('Newspaper')
for city in last_month.columns[1:]:  # skip the leftover 'Date' column
    # The highest value is the 'Total' row, so skip it
    top_3 = last_month[city].nlargest(4)[1:]
    somelist.append(top_3)
print(somelist)
This produces a list of pandas Series, with the column name printed below each one:
[Newspaper
Newspaper1 231295
Newspaper2 169937
Newspaper6 137990
Name: City1, dtype: float64, Newspaper
Newspaper2 12130.0
Newspaper1 8391.0
Newspaper6 8010.0
Name: City2, dtype: float64, Newspaper
Newspaper1 240259
Newspaper2 185189
Newspaper6 145675
Name: Region1Total, dtype: float64, Newspaper
Newspaper6 14500.0
Newspaper1 8790.0
Newspaper2 7890.0
Name: City3, dtype: float64, Newspaper
Newspaper2 28850.0
Newspaper6 25330.0
Newspaper1 21176.0
Name: City4, dtype: float64, Newspaper
Newspaper6 36300
Newspaper2 34460
Newspaper1 34460
Name: Region2Total, dtype: float64, Newspaper]
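What I want is the top three newspapers for each city and region, with their sales figures in descending order. I would also like the name of the city/region to be printed before its top 3 results.
The expected output is a list or a Series like the following: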
Newspaper City1
Newspaper1 231295
Newspaper2 169937
Newspaper6 137990
Newspaper City2
Newspaper2 12130.0
Newspaper1 8391.0
Newspaper6 8010.0
Newspaper Region1Total
Newspaper1 240259
Newspaper2 185189
Newspaper6 145675
Newspaper City3
Newspaper6 14500.0
Newspaper1 8790.0
Newspaper2 7890.0
Newspaper City4
Newspaper2 28850.0
Newspaper6 25330.0
Newspaper1 21176.0
Newspaper Region2Total
Newspaper6 36300
Newspaper2 34460
Newspaper1 34460
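One possible way to get this shape, as a minimal sketch rather than a definitive answer: drop the Total row explicitly instead of slicing it off with [1:], collect the top n of each column in a dict keyed by column name, and print the key before each Series. The helper name top_n_per_column and the inline sample frame (a few columns of the 2018-01-01 rows above) are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample: part of the latest month (2018-01-01) from the table above.
last_month = pd.DataFrame(
    {
        "Newspaper": ["NewsPaper1", "NewsPaper2", "NewsPaper3", "NewsPaper4",
                      "NewsPaper5", "NewsPaper6", "Total"],
        "City1": [231295, 169937, 40453, 37766, 5136, 137990, 633919],
        "City2": [8391, 12130, 4570, 2970, 900, 8010, 37786],
        "Region1Total": [239686, 182067, 45023, 40736, 6036, 146000, 671705],
    }
).set_index("Newspaper")


def top_n_per_column(frame, n=3, skip_rows=("Total",)):
    """Return {column name: Series of the n largest values, in descending order}."""
    # Remove summary rows by label so they can never appear in the ranking.
    body = frame.drop(index=list(skip_rows), errors="ignore")
    return {col: body[col].nlargest(n) for col in body.columns}


top3 = top_n_per_column(last_month)
for name, series in top3.items():
    print(name)                # city/region name first
    print(series.to_string())  # then its top 3 newspapers and figures
    print()
```

Dropping the row by label avoids relying on Total always being the largest value in every column.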
Also, if I want to skip the regions and consider only the cities, how would I do that? Any help would be appreciated. Thank you in advance.
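For the cities-only case, one hedged sketch is to filter the columns by name before ranking. It assumes that region columns are exactly those whose name ends in "Total", as in the sample data; the small inline frame and the variable names city_cols and cities_only are illustrative only.

```python
import pandas as pd

# Hypothetical subset of the 2018-01-01 rows, indexed by newspaper.
last_month = pd.DataFrame(
    {
        "City1": [231295, 169937, 137990, 633919],
        "Region1Total": [239686, 182067, 146000, 671705],
        "City3": [8790, 7890, 14500, 44980],
        "Region2Total": [29966, 36740, 39830, 136121],
    },
    index=pd.Index(["NewsPaper1", "NewsPaper2", "NewsPaper6", "Total"], name="Newspaper"),
)

# Assumption: a column is a region if and only if its name ends in "Total".
city_cols = [c for c in last_month.columns if not c.endswith("Total")]

# Drop the summary row, then keep only the city columns.
cities_only = last_month.drop(index="Total")[city_cols]

for city in city_cols:
    print(city)
    print(cities_only[city].nlargest(3).to_string())
```

If the real column names do not follow that naming pattern, an explicit list of city columns would be the safer choice.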