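This is the dataset I have. The following items are recorded once per day.
Cigarettes, Tobacco, Snack/Grocery, Beverages, Milk, Coffee, Solaray, Prepared Foods, International Food, Automotive/News Paper, Lottery - Scratch, Lottery - Machine, and Whl-Sales/Gift-Card repeat for every date.
I want to convert this frame into one that covers the same data, with the repeated departments as columns, Date as the index, and Sales as the values. I tried using pivot_table, but I realized it changes the values and combines them. This is what I came up with, but it returns unexpected results...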
dept = dept.pivot_table(values='Sales', index = dept.index, columns='Dept', aggfunc='first')
And this is the original data frame I want to change.
Date Dept Sales
2018-12-01 Cigarettes 426.889
2018-12-01 Tobacco 43.84
2018-12-01 Snack/Grocery 198.57
2018-12-01 Beverages 160.97
2018-12-01 Milk 11.56
2018-12-01 Coffee 29.72
2018-12-01 Solaray 9.99
2018-12-01 Prepared Foods 3.99
2018-12-01 International Food 65
2018-12-01 Sweets 0
2018-12-01 Automotive/News Paper 10.47
2018-12-01 Lottery - Scratch 1397
2018-12-01 Lottery - Machine 191
2018-12-01 Whl-Sales/Gift-Card 0
2018-12-01 Total 2549
2018-12-02 Cigarettes 374.01
2018-12-02 Tobacco 89.29
2018-12-02 Snack/Grocery 178.01
2018-12-02 Beverages 135.28
2018-12-02 Milk 9.57
2018-12-02 Coffee 33.76
2018-12-02 Solaray 17.99
2018-12-02 Prepared Foods 20.98
2018-12-02 International Food 3.98
2018-12-02 Sweets 0
2018-12-02 Automotive/News Paper 13.16
2018-12-02 Lottery - Scratch 651
2018-12-02 Lottery - Machine 211
2018-12-02 Whl-Sales/Gift-Card 0
2018-12-02 Total 1738.03
2018-12-03 Cigarettes 463.54
2018-12-03 Tobacco 35.26
2018-12-03 Snack/Grocery 164.19
2018-12-03 Beverages 126.01
2018-12-03 Milk 8.57
2018-12-03 Coffee 30.47
2018-12-03 Solaray 17.99
2018-12-03 Prepared Foods 0
2018-12-03 International Food 21.98
2018-12-03 Sweets 0
2018-12-03 Automotive/News Paper 70.17
2018-12-03 Lottery - Scratch 1046
2018-12-03 Lottery - Machine 461
2018-12-03 Whl-Sales/Gift-Card 0
2018-12-03 Total 2445.18
2018-12-03 Cigarettes 463.54
2018-12-03 Tobacco 35.26
2018-12-03 Snack/Grocery 164.19
2018-12-03 Beverages 126.01
2018-12-03 Milk 8.57
2018-12-03 Coffee 30.47
2018-12-03 Solaray 17.99
2018-12-03 Prepared Foods 0
2018-12-03 International Food 21.98
2018-12-03 Sweets 0
2018-12-03 Automotive/News Paper 70.17
2018-12-03 Lottery - Scratch 1046
2018-12-03 Lottery - Machine 461
2018-12-03 Whl-Sales/Gift-Card 0
2018-12-03 Total 2445.18
2018-12-04 Cigarettes 291.91
2018-12-04 Tobacco 42.93
2018-12-04 Snack/Grocery 207.87
2018-12-04 Beverages 163.11
2018-12-04 Milk 3.99
2018-12-04 Coffee 32.17
2018-12-04 Solaray 40.98
2018-12-04 Prepared Foods 5
2018-12-04 International Food 6.98
2018-12-04 Sweets 0
2018-12-04 Automotive/News Paper 47
2018-12-04 Lottery - Scratch 762
2018-12-04 Lottery - Machine 112.75
2018-12-04 Whl-Sales/Gift-Card NaN
2018-12-04 Total 1716.69
2018-12-05 Cigarettes 255.72
2018-12-05 Tobacco 81.52
2018-12-05 Snack/Grocery 212.94
2018-12-05 Beverages 87.94
2018-12-05 Milk 9.77
2018-12-05 Coffee 15.95
2018-12-05 Solaray 11.98
2018-12-05 Prepared Foods 8.98
2018-12-05 International Food 17.73
2018-12-05 Sweets 0
2018-12-05 Automotive/News Paper 46.24
2018-12-05 Lottery - Scratch 540
2018-12-05 Lottery - Machine 151
2018-12-05 Whl-Sales/Gift-Card NaN
2018-12-05 Total 1439.77
2018-12-06 Cigarettes 377.96
2018-12-06 Tobacco 129.07
2018-12-06 Snack/Grocery 281.83
2018-12-06 Beverages 235.73
2018-12-06 Milk 0
2018-12-06 Coffee 29.32
2018-12-06 Solaray 12.99
2018-12-06 Prepared Foods 27.37
2018-12-06 International Food 9.99
2018-12-06 Sweets 5
2018-12-06 Automotive/News Paper 32.92
2018-12-06 Lottery - Scratch 509
2018-12-06 Lottery - Machine 194
2018-12-06 Whl-Sales/Gift-Card NaN
2018-12-06 Total 1845.18
2018-12-07 Cigarettes 526.91
2018-12-07 Tobacco 65.71
2018-12-07 Snack/Grocery 202.27
2018-12-07 Beverages 183.59
2018-12-07 Milk 2.79
2018-12-07 Coffee 16.22
2018-12-07 Solaray 5.99
2018-12-07 Prepared Foods 24.98
2018-12-07 International Food 1.99
2018-12-07 Sweets 0
2018-12-07 Automotive/News Paper 31.06
2018-12-07 Lottery - Scratch 300
2018-12-07 Lottery - Machine 61.5
2018-12-07 Whl-Sales/Gift-Card 0
2018-12-07 Total 1423.01
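Answer 0 (score: 1)
One way is to set the index to ['Date', 'Dept'] and call unstack(), but for the date 2018-12-03 every Dept has more than one value.
I'm not sure whether that is expected, but one way to work around it is to use groupby().first() to take the first value and then unstack(), e.g.: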
In []:
df.set_index(['Date', 'Dept']).groupby(level=[0, 1]).first().unstack()
Out []:
Sales
Dept Automotive/News Paper Beverages Cigarettes Coffee International Food Lottery - Machine Lottery - Scratch Milk Prepared Foods Snack/Grocery Solaray Sweets Tobacco Total Whl-Sales/Gift-Card
Date
2018-12-01 10.47 160.97 426.889 29.72 65.00 191.00 1397.0 11.56 3.99 198.57 9.99 0.0 43.84 2549.00 0.0
2018-12-02 13.16 135.28 374.010 33.76 3.98 211.00 651.0 9.57 20.98 178.01 17.99 0.0 89.29 1738.03 0.0
2018-12-03 70.17 126.01 463.540 30.47 21.98 461.00 1046.0 8.57 0.00 164.19 17.99 0.0 35.26 2445.18 0.0
2018-12-04 47.00 163.11 291.910 32.17 6.98 112.75 762.0 3.99 5.00 207.87 40.98 0.0 42.93 1716.69 NaN
2018-12-05 46.24 87.94 255.720 15.95 17.73 151.00 540.0 9.77 8.98 212.94 11.98 0.0 81.52 1439.77 NaN
2018-12-06 32.92 235.73 377.960 29.32 9.99 194.00 509.0 0.00 27.37 281.83 12.99 5.0 129.07 1845.18 NaN
2018-12-07 31.06 183.59 526.910 16.22 1.99 61.50 300.0 2.79 24.98 202.27 5.99 0.0 65.71 1423.01 0.0
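But this is pretty much the same as df.pivot_table(index='Date', columns='Dept', values='Sales'), which by default averages repeated entries (here the repeated 2018-12-03 rows are identical, so the values are unchanged):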
Dept Automotive/News Paper Beverages Cigarettes Coffee International Food Lottery - Machine Lottery - Scratch Milk Prepared Foods Snack/Grocery Solaray Sweets Tobacco Total Whl-Sales/Gift-Card
Date
2018-12-01 10.47 160.97 426.889 29.72 65.00 191.00 1397.0 11.56 3.99 198.57 9.99 0.0 43.84 2549.00 0.0
2018-12-02 13.16 135.28 374.010 33.76 3.98 211.00 651.0 9.57 20.98 178.01 17.99 0.0 89.29 1738.03 0.0
2018-12-03 70.17 126.01 463.540 30.47 21.98 461.00 1046.0 8.57 0.00 164.19 17.99 0.0 35.26 2445.18 0.0
2018-12-04 47.00 163.11 291.910 32.17 6.98 112.75 762.0 3.99 5.00 207.87 40.98 0.0 42.93 1716.69 NaN
2018-12-05 46.24 87.94 255.720 15.95 17.73 151.00 540.0 9.77 8.98 212.94 11.98 0.0 81.52 1439.77 NaN
2018-12-06 32.92 235.73 377.960 29.32 9.99 194.00 509.0 0.00 27.37 281.83 12.99 5.0 129.07 1845.18 NaN
2018-12-07 31.06 183.59 526.910 16.22 1.99 61.50 300.0 2.79 24.98 202.27 5.99 0.0 65.71 1423.01 0.0
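
The two approaches above can be tried on a small frame. The following is only a sketch, assuming Date, Dept and Sales are ordinary columns; the sample is a hypothetical cut-down version of the question's data (two departments, with the 2018-12-03 rows entered twice), and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical cut-down sample: two departments, and the 2018-12-03
# rows appear twice, as they do in the question's frame.
df = pd.DataFrame({
    "Date": ["2018-12-01", "2018-12-01",
             "2018-12-03", "2018-12-03",
             "2018-12-03", "2018-12-03"],
    "Dept": ["Cigarettes", "Tobacco"] * 3,
    "Sales": [426.889, 43.84, 463.54, 35.26, 463.54, 35.26],
})

# Rows whose (Date, Dept) pair already appeared earlier in the frame.
dupes = df[df.duplicated(subset=["Date", "Dept"])]

# Keep the first value for each (Date, Dept) pair and spread Dept into columns.
wide_first = df.pivot_table(index="Date", columns="Dept",
                            values="Sales", aggfunc="first")

# Default aggfunc is the mean; identical duplicates give the same numbers.
wide_mean = df.pivot_table(index="Date", columns="Dept", values="Sales")

# Alternative: drop the repeated pairs, after which a plain pivot works.
wide_dedup = (df.drop_duplicates(subset=["Date", "Dept"])
                .pivot(index="Date", columns="Dept", values="Sales"))

print(dupes)
print(wide_first)
```

If the repeated rows could ever carry different Sales values, 'first' and the default mean would no longer agree, so it is probably worth looking at dupes before choosing one.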