How do I convert a DataFrame with a duplicated index into a grouped-index or MultiIndex DataFrame?

Time: 2020-01-27 14:38:32

Tags: python dataframe pandas-groupby

I have a DataFrame whose index contains repeated dates. The other columns hold "longitude", "latitude" and "altitude" values. The DataFrame looks like this:

Index                        lon          lat                     alt

2019-12-07 01:34:16.483601  -2.58   -107.609395 -29.976347          9019.0
2019-12-07 01:34:16.483601  3.77    -107.62478100000001 -29.979124  11158.0
2019-12-07 01:34:16.483601  3.82    -107.653606 -29.984322          15060.0
2019-12-07 01:34:16.483601  -2.02   -107.57474400000001 -29.970085  4610.0
2019-12-07 01:34:16.483601  29.62   -107.638236 -29.981551          13045.0
2019-12-07 01:34:16.859801  -0.99   -107.580839 -29.945495          4609.0
2019-12-07 01:34:16.859801  5.06    -107.675038 -29.962496          17076.0
2019-12-07 01:34:16.859801  -10.19  -107.56735499999999 -29.943056  2971.0
2019-12-07 01:34:16.859801  11.64   -107.62509600000001 -29.953492  10401.0
2019-12-07 01:34:16.859801  3.62    -107.619328 -29.952451          9646.0
2019-12-07 01:34:16.859801  6.06    -107.603939 -29.949671          7507.0
2019-12-07 01:34:16.859801  -1.54   -107.607787 -29.950366          8011.0
2019-12-07 01:34:17.236001  -5.84   -107.598477 -29.922998          6119.0
2019-12-07 01:34:17.236001  2.17    -107.60425000000001 -29.924041  6874.0
2019-12-07 01:34:17.612201  -38.43  -107.604602 -29.898415          6123.0
2019-12-07 01:34:17.988401  21.17   -107.63375500000001 -29.877999  9021.0
2019-12-07 01:34:17.988401  -0.27   -107.570255 -29.866514          713.0
2019-12-07 01:34:19.117001  13.77   -107.696124 -29.812198          15063.0
2019-12-07 01:34:19.493201  10.45   -107.598466 -29.768864          1340.0
2019-12-07 01:34:19.493201  11.1    -107.650375 -29.778255          8014.0

Instead of many rows repeating the same index value, I would like to have something like this:

Index                        lon          lat                     alt

2019-12-07 01:34:16.483601  -2.58   -107.609395 -29.976347          9019.0
                            3.77    -107.62478100000001 -29.979124  11158.0
                            3.82    -107.653606 -29.984322          15060.0
                            -2.02   -107.57474400000001 -29.970085  4610.0
                            29.62   -107.638236 -29.981551          13045.0
2019-12-07 01:34:16.859801  -0.99   -107.580839 -29.945495          4609.0
                            5.06    -107.675038 -29.962496          17076.0
                            -10.19  -107.56735499999999 -29.943056  2971.0
                            11.64   -107.62509600000001 -29.953492  10401.0
                            3.62    -107.619328 -29.952451          9646.0
                            6.06    -107.603939 -29.949671          7507.0
                            -1.54   -107.607787 -29.950366          8011.0
2019-12-07 01:34:17.236001  -5.84   -107.598477 -29.922998          6119.0
                            2.17    -107.60425000000001 -29.924041  6874.0
2019-12-07 01:34:17.612201  -38.43  -107.604602 -29.898415          6123.0
2019-12-07 01:34:17.988401  21.17   -107.63375500000001 -29.877999  9021.0
                            -0.27   -107.570255 -29.866514          713.0
2019-12-07 01:34:19.117001  13.77   -107.696124 -29.812198          15063.0
2019-12-07 01:34:19.493201  10.45   -107.598466 -29.768864          1340.0
                            11.1    -107.650375 -29.778255          8014.0

My end goal in building a clean DataFrame is to upsample it with df.resample("1min").asfreq(). I cannot drop the duplicated index entries, so I cannot use df.resample("1min").asfreq() on the data as it stands.

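I do not want to use groupby.mean(), because I do not want to lose the detail in the three columns. Is there a way to use groupby to arrive at the DataFrame shown above? Any suggestions or help would be much appreciated. Thanks!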
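
One possible way to get the layout above, offered as a sketch and not as a confirmed answer, is to number the rows within each timestamp using groupby(...).cumcount() and append that counter as a second index level. The index name "time", the level name "obs", and the four sample rows are assumptions made for illustration; only the column names lon, lat and alt come from the question.

```python
import pandas as pd

# Illustrative frame with a duplicated DatetimeIndex (values taken from a few
# of the sample rows; the index name "time" is an assumption).
idx = pd.to_datetime([
    "2019-12-07 01:34:16.483601",
    "2019-12-07 01:34:16.483601",
    "2019-12-07 01:34:16.859801",
    "2019-12-07 01:34:17.612201",
])
df = pd.DataFrame(
    {
        "lon": [-107.609395, -107.624781, -107.580839, -107.604602],
        "lat": [-29.976347, -29.979124, -29.945495, -29.898415],
        "alt": [9019.0, 11158.0, 4609.0, 6123.0],
    },
    index=idx,
)
df.index.name = "time"

# Number the rows inside each timestamp group: 0, 1, 2, ...
# "obs" is a hypothetical name for this per-timestamp counter.
counter = df.groupby(level=0).cumcount().to_numpy()

# Append the counter as a second index level; every (time, obs) pair is unique,
# and no row is aggregated away.
multi = df.assign(obs=counter).set_index("obs", append=True)
print(multi)

# Optional: pivot the counter into the columns to get one row per timestamp.
wide = multi.unstack("obs")
```

When printed, pandas leaves repeated outer-level labels blank by default, which gives the grouped look shown in the desired output. Whether the MultiIndex frame or the wide frame is the better starting point for resampling depends on how the repeated observations should be treated, which the question leaves open.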

0 answers:

No answers