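I'm trying to create new DataFrames by using groupby on a MultiIndex DataFrame df. Level 0 is a string identifier and level 1 is a DatetimeIndex. Ultimately, I want to determine the total time each vsl is associated with each DIV and DIS. Here is a snippet of df: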
                          DIV  DIS
vsl  BeginTime
vsl1 2015-08-19 16:40:00  SAD  SAJ
     2015-08-20 03:45:00  SAD  SAJ
     2015-08-20 13:55:00  SAD  SAJ
...
vsl2 2015-06-11 07:10:00  NWD  NWP
     2015-06-11 16:35:00  NWD  NWP
     2015-06-12 01:50:00  NWD  NWP
     2015-06-12 11:25:00  NWD  NWP
...
vsl3 2015-06-24 02:40:00  MVD  MVN
     2015-06-24 06:50:00  MVD  MVN
     2016-01-21 13:05:00  NAD  NAN
     2016-01-21 23:35:00  NAD  NAN
...
[6594 rows x 2 columns]
I have looked at "How to iterate over pandas multiindex dataframe using index" and came up with this, which does not do what I want:
for vsl, new_df in df.groupby(level=0):
    vsl = new_df
I was expecting new DataFrames ['vsl1', 'vsl2', 'vsl3'], each containing the contents of its groupby group, i.e. for vsl1:
                          DIV  DIS
vsl  BeginTime
vsl1 2015-08-19 16:40:00  SAD  SAJ
     2015-08-20 03:45:00  SAD  SAJ
     2015-08-20 13:55:00  SAD  SAJ
...
[411 rows x 2 columns]
If I evaluate vsl1:
In [102]: vsl1
Traceback (most recent call last):
  File "<ipython-input-102-7a5664be723c>", line 1, in <module>
    vsl1
NameError: name 'vsl1' is not defined
If I evaluate vsl:
In [103]: vsl
Out[103]:
                          DIV  DIS
vsl  BeginTime
vsl3 2015-06-24 02:40:00  MVD  MVN
     2015-06-24 06:50:00  MVD  MVN
     2016-01-21 13:05:00  NAD  NAN
     2016-01-21 23:35:00  NAD  NAN
...
[412 rows x 2 columns]
As a test, I tried printing the groups, as shown in the referenced post:
In [104]: for vsl, new_df in df.groupby(level=0):
     ...:     print(new_df)
     ...:
Out[104]:
                          DIV  DIS
vsl  BeginTime
vsl1 2015-08-19 16:40:00  SAD  SAJ
     2015-08-20 03:45:00  SAD  SAJ
     2015-08-20 13:55:00  SAD  SAJ
...
[411 rows x 2 columns]
                          DIV  DIS
vsl  BeginTime
vsl2 2015-06-11 07:10:00  NWD  NWP
     2015-06-11 16:35:00  NWD  NWP
     2015-06-12 01:50:00  NWD  NWP
     2015-06-12 11:25:00  NWD  NWP
...
[410 rows x 2 columns]
                          DIV  DIS
vsl  BeginTime
vsl3 2015-06-24 02:40:00  MVD  MVN
     2015-06-24 06:50:00  MVD  MVN
     2016-01-21 13:05:00  NAD  NAN
     2016-01-21 23:35:00  NAD  NAN
...
[412 rows x 2 columns]
What am I missing, and how can I create a new DataFrame for each vsl contained in level 0?
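For reference, here is a minimal sketch of one direction that might work: instead of rebinding the loop variable, collect each group in a dict keyed by its level-0 label. The small df below is a stand-in built from a few rows of the snippet above, and the name frames is purely illustrative, not something from my real code.

```python
import pandas as pd

# Stand-in for the real df, using a few rows from the snippet (assumption:
# the real frame has the same two-level index and the same two columns).
rows = [
    ("vsl1", "2015-08-19 16:40:00", "SAD", "SAJ"),
    ("vsl1", "2015-08-20 03:45:00", "SAD", "SAJ"),
    ("vsl2", "2015-06-11 07:10:00", "NWD", "NWP"),
    ("vsl2", "2015-06-11 16:35:00", "NWD", "NWP"),
    ("vsl3", "2015-06-24 02:40:00", "MVD", "MVN"),
    ("vsl3", "2016-01-21 13:05:00", "NAD", "NAN"),
]
df = pd.DataFrame(rows, columns=["vsl", "BeginTime", "DIV", "DIS"])
df["BeginTime"] = pd.to_datetime(df["BeginTime"])
df = df.set_index(["vsl", "BeginTime"])

# `vsl = new_df` inside the loop only rebinds the name `vsl` on each pass;
# it never creates variables named vsl1, vsl2, ... so only the last group
# survives. A dict keyed by the level-0 label keeps every group.
frames = {name: group for name, group in df.groupby(level=0)}

print(sorted(frames))   # the level-0 labels
print(frames["vsl1"])   # the sub-frame for vsl1, MultiIndex intact
```

With this, frames["vsl1"] plays the role of the vsl1 variable I was hoping for, and the same lookup should work for any other label in level 0.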