Below is my dataframe:
info date time file msg
0 INFO: 2018-09-12 16:10:10: view.py: phone
1 INFO: 2018-09-12 16:10:10: view.py: asdasd
2 INFO: 2018-09-12 16:10:43: view.py: contact start
3 INFO: 2018-09-12 16:10:43: view.py: contact end
4 INFO: 2018-09-12 16:11:36: view.py: app start
5 INFO: 2018-09-12 16:11:36: view.py: busy start
6 INFO: 2018-09-12 16:12:08: view.py: busy end
7 INFO: 2018-09-12 16:12:08: view.py: contact end
8 INFO: 2018-09-12 16:12:08: view.py: app end
9 INFO: 2018-09-12 16:12:08: view.py: phone
7 INFO: 2018-09-12 16:12:08: view.py: contact end
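I want to split this dataframe into multiple dataframes based on the values in the msg column.
If I split on the value "phone", my dataframes should look like this: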
df1:
info date time file msg
0 INFO: 2018-09-12 16:10:10: view.py: phone
1 INFO: 2018-09-12 16:10:10: view.py: asdasd
2 INFO: 2018-09-12 16:10:43: view.py: contact start
3 INFO: 2018-09-12 16:10:43: view.py: contact end
4 INFO: 2018-09-12 16:11:36: view.py: app start
5 INFO: 2018-09-12 16:11:36: view.py: busy start
6 INFO: 2018-09-12 16:12:08: view.py: busy end
7 INFO: 2018-09-12 16:12:08: view.py: contact end
8 INFO: 2018-09-12 16:12:08: view.py: app end
df2:
info date time file msg
9 INFO: 2018-09-12 16:12:08: view.py: phone
7 INFO: 2018-09-12 16:12:08: view.py: contact end
Answer 0 (score: 0)
Use a dictionary for a variable number of related variables. Here you can combine GroupBy + cumsum:

    d = dict(tuple(df.groupby(df['msg'].eq('phone').cumsum())))

Then access the dataframes via d[1], d[2], ..., d[n].
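To illustrate the GroupBy + cumsum approach, here is a minimal sketch. It assumes a simplified frame that keeps only the msg column from the question and uses a default 0..10 index; the name `group_id` is introduced here for illustration and is not part of the original answer.

```python
import pandas as pd

# Simplified stand-in for the question's dataframe (assumption: only the
# msg column is kept, with a default integer index).
msgs = [
    "phone", "asdasd", "contact start", "contact end", "app start",
    "busy start", "busy end", "contact end", "app end",
    "phone", "contact end",
]
df = pd.DataFrame({"msg": msgs})

# eq('phone') marks each row that starts a new block; cumsum() turns those
# marks into a running block number (1 for the first block, 2 for the next).
group_id = df["msg"].eq("phone").cumsum()

# Iterating a GroupBy yields (key, sub-frame) pairs, so dict(tuple(...))
# builds a mapping from block number to that block's dataframe.
d = dict(tuple(df.groupby(group_id)))

print(d[1])  # rows from the first "phone" up to the row before the second
print(d[2])  # rows from the second "phone" to the end
```

Any rows that appear before the first "phone" would fall into a group with key 0, since the running sum is still zero at that point.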