How to pivot the values of one DataFrame column into column headers

Time: 2019-01-25 23:20:57

Tags: python pandas seaborn

Suppose we have a DataFrame with three columns like the following:

    timestamp  bin  cnt
0  1548453780  0.2    0
1  1548453780  0.3    5
2  1548453780  0.4    0
3  1548453780  0.5    3
4  1548453780  0.6    0

How would you produce this from it?

bin         0.2  0.3  0.4
timestamp
1548453780    0    5   10
1548453782    2    3    0

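How can I use a pivot to generate a structure like the one shown below? I have tried various groupby and pivot_table calls from pandas, for example df.groupby(['timestamp','bin']).sum(), but the bin values do not end up along the top as column headers the way they do in the example below.

The seaborn heatmap documentation has an example that uses the flights dataset: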

https://seaborn.pydata.org/generated/seaborn.heatmap.html

   year     month  passengers
0  1949   January         112
1  1949  February         118
2  1949     March         132
3  1949     April         129
4  1949       May         121

Doing a pivot: flights.pivot(index="month", columns="year", values="passengers")

produces this:

year       1949  1950  1951  1952  1953  1954  1955  1956  1957  1958  1959  \
month                                                                         
January     112   115   145   171   196   204   242   284   315   340   360   
February    118   126   150   180   196   188   233   277   301   318   342   
March       132   141   178   193   236   235   267   317   356   362   406   
April       129   135   163   181   235   227   269   313   348   348   396   
May         121   125   172   183   229   234   270   318   355   363   420   
June        135   149   178   218   243   264   315   374   422   435   472   
July        148   170   199   230   264   302   364   413   465   491   548   
August      148   170   199   242   272   293   347   405   467   505   559   
September   136   158   184   209   237   259   312   355   404   404   463   
October     119   133   162   191   211   229   274   306   347   359   407   
November    104   114   146   172   180   203   237   271   305   310   362   
December    118   140   166   194   201   229   278   306   336   337   405   

1 answer:

Answer 0 (score: 1)

Suppose you have a DataFrame like this

import pandas as pd

df = pd.DataFrame({"timestamp": [1548453780] * 3 + [1548453782] * 3,
                   "bins": [0.2, 0.3, 0.4] * 2,
                   "cnt": [0, 5, 10, 2, 3, 0]})

which looks like

    timestamp  bins  cnt
0  1548453780   0.2    0
1  1548453780   0.3    5
2  1548453780   0.4   10
3  1548453782   0.2    2
4  1548453782   0.3    3
5  1548453782   0.4    0

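Then you can pivot it, with timestamp as the index, bins as the columns, and cnt as the values (keyword arguments are used because pandas 2.0 and later no longer accept these positionally):

piv = df.pivot(index="timestamp", columns="bins", values="cnt")

and get the desired output: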

bins        0.2  0.3  0.4
timestamp
1548453780    0    5   10
1548453782    2    3    0
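
Since the question started from groupby and pivot_table, here is a minimal sketch of how those two routes could reach the same layout. It reuses the sample frame from this answer; the variable names by_unstack and by_pivot_table are only illustrative, and summing with "sum" is an assumption about how repeated (timestamp, bins) pairs should be combined.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({"timestamp": [1548453780] * 3 + [1548453782] * 3,
                   "bins": [0.2, 0.3, 0.4] * 2,
                   "cnt": [0, 5, 10, 2, 3, 0]})

# Route 1: finish the groupby attempt by moving the "bins" index level
# into the column headers with unstack.
by_unstack = df.groupby(["timestamp", "bins"])["cnt"].sum().unstack("bins")

# Route 2: pivot_table does the grouping and reshaping in one call and,
# unlike pivot, aggregates repeated (timestamp, bins) pairs.
by_pivot_table = df.pivot_table(index="timestamp", columns="bins",
                                values="cnt", aggfunc="sum")

print(by_unstack)
print(by_pivot_table)
```

Plain pivot raises an error when a (timestamp, bins) pair occurs more than once, so one of these two routes may be the safer choice if the real data can contain duplicates.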