Question

I regularly receive a CSV that looks like this (simplified):

Published   Station         TypeFuel    Price
1/09/2015   BP Seaford      ULP         129.9
1/09/2015   BP Seaford      Diesel      133.9
1/09/2015   BP Seaford      Gas         156.9
1/09/2015   Shell Newhaven  ULP         139.9
1/09/2015   Shell Newhaven  Diesel      150.9
1/09/2015   7-Eleven Malaga ULP         135.9
1/09/2015   7-Eleven Malaga Diesel      155.9
2/10/2015   BP Seaford      ULP         138.9
2/10/2015   BP Seaford      Diesel      133.6
2/10/2015   BP Seaford      Gas         157.9

...... many more rows omitted. The file covers about 200 stations, each reporting daily over 20-30 days.

I need to summarize it so that it looks like this:

Published   Station         ULP     Diesel  Gas
1/09/2015   BP Seaford      129.9   133.9   156.9
1/09/2015   Shell Newhaven  139.9   150.9   
1/09/2015   7-Eleven Malaga 135.9   155.9   
2/09/2015   BP Seaford      138.9   133.6   157.9

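I have only worked through a few steps of the Pandas tutorial and I am new to Python as well, but I believe the two together should let me get this done.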

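I believe I need to iterate over the CSV and, wherever Published and Station match, create a new row that moves the ULP / Diesel / Gas prices into new columns.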

Answer 1

You are looking for DataFrame.pivot_table(). Pivot on the columns 'Published' and 'Station', take the values of the TypeFuel column as the new columns of the pivot table, and use Price as its values. Example -

In [5]: df
Out[5]:
   Published          Station TypeFuel  Price
0  1/09/2015       BP Seaford      ULP  129.9
1  1/09/2015       BP Seaford   Diesel  133.9
2  1/09/2015       BP Seaford      Gas  156.9
3  1/09/2015   Shell Newhaven      ULP  139.9
4  1/09/2015   Shell Newhaven   Diesel  150.9
5  1/09/2015  7-Eleven Malaga      ULP  135.9
6  1/09/2015  7-Eleven Malaga   Diesel  155.9
7  2/10/2015       BP Seaford      ULP  138.9
8  2/10/2015       BP Seaford   Diesel  133.6
9  2/10/2015       BP Seaford      Gas  157.9

In [7]: df.pivot_table(index=['Published','Station'],columns=['TypeFuel'],values='Price')
Out[7]:
TypeFuel                   Diesel    Gas    ULP
Published Station
1/09/2015 7-Eleven Malaga   155.9    NaN  135.9
          BP Seaford        133.9  156.9  129.9
          Shell Newhaven    150.9    NaN  139.9
2/10/2015 BP Seaford        133.6  157.9  138.9

If you do not want Published and Station to become the index, you can call .reset_index() on the result of pivot_table() to reset the index.
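As a rough end-to-end sketch of the pivot_table() plus .reset_index() approach, the snippet below rebuilds the sample rows from the question and flattens the result into ordinary columns. It assumes the real file is comma-separated (the question only shows an aligned, simplified view), and the names `raw` and `wide` are made up for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question. Assumption: the real file is
# comma-separated and would be loaded with pd.read_csv("<your file>").
raw = """Published,Station,TypeFuel,Price
1/09/2015,BP Seaford,ULP,129.9
1/09/2015,BP Seaford,Diesel,133.9
1/09/2015,BP Seaford,Gas,156.9
1/09/2015,Shell Newhaven,ULP,139.9
1/09/2015,Shell Newhaven,Diesel,150.9
1/09/2015,7-Eleven Malaga,ULP,135.9
1/09/2015,7-Eleven Malaga,Diesel,155.9
2/10/2015,BP Seaford,ULP,138.9
2/10/2015,BP Seaford,Diesel,133.6
2/10/2015,BP Seaford,Gas,157.9
"""
df = pd.read_csv(io.StringIO(raw))

# One row per (Published, Station), one column per fuel type.
# reset_index() turns the two index levels back into regular columns.
wide = df.pivot_table(
    index=["Published", "Station"], columns="TypeFuel", values="Price"
).reset_index()

# Drop the leftover "TypeFuel" label on the column axis and put the
# fuel columns in the order the question asked for.
wide.columns.name = None
wide = wide[["Published", "Station", "ULP", "Diesel", "Gas"]]

print(wide)
```

Stations that report no Gas price get NaN in that column. Note also that pivot_table() aggregates with the mean by default, so if a station reported the same fuel twice on one day, the two prices would be averaged rather than raising an error; pass a different `aggfunc` if that is not what you want.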

Detect duplicates and create summary rows