I have a DataFrame like this:
Frame 0 1 ... start_frame end_frame phn
0 0 7.648325 0.098433 ... 0.0 25.0 h#
1 1 8.006168 0.045991 ... 10.0 35.0 h#
2 2 8.260857 0.331792 ... 20.0 45.0 h#
3 3 8.211206 0.126892 ... 30.0 55.0 h#
4 4 7.999766 0.219560 ... 40.0 65.0 h#
5 5 7.602877 0.095582 ... 50.0 75.0 h#
6 6 7.747911 0.118326 ... 60.0 85.0 h#
7 7 7.958229 -0.049620 ... 70.0 95.0 h#
...
25 25 15.159771 2.047468 ... 250.0 275.0 sh
26 26 15.580827 1.910970 ... 260.0 285.0 ix
27 27 15.899938 1.510074 ... 270.0 295.0 ix
28 28 16.191772 1.646987 ... 280.0 305.0 ix
29 29 16.055186 1.585445 ... 290.0 315.0 ix
.. ... ... ... ... ... ... ...
336 336 15.277283 1.688955 ... 3360.0 3385.0 y
337 337 15.446976 1.615444 ... 3370.0 3395.0 ih
338 338 15.628509 1.944911 ... 3380.0 3405.0 ih
339 339 15.737163 1.736013 ... 3390.0 3415.0 ih
...
361 361 8.719288 -1.060700 ... 3610.0 3635.0 h#
362 362 8.500200 -0.810346 ... 3620.0 3645.0 h#
363 363 8.186726 -0.479683 ... 3630.0 3655.0 h#
364 364 8.151884 -0.277089 ... 3640.0 3665.0 h#
365 365 7.944815 -0.460370 ... 3650.0 3675.0 h#
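I want to get one structure for each run of consecutive values in the 'phn' column. For example:
1) a first matrix with the rows of the first occurrence of h#, in range (0, 7)
2) a second matrix with the rows where 'sh' is in range(value, 25)
and so on, until the last matrix, whose rows are in range (361, 365), for the last occurrence of 'h#'.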
Answer 0 (score: 1)
First group by consecutive values, then create a list or a dictionary:
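# label each run of consecutive equal 'phn' values with its own integer
g = df['phn'].ne(df['phn'].shift()).cumsum()
# as a list of DataFrames, one per run
L = [v for k, v in df.groupby(g)]
print(L)
# as a dictionary keyed by run number
d = dict(tuple(df.groupby(g)))
# alternative
d = {k: v for k, v in df.groupby(g)}
print(d)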
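To see why comparing each value with its shifted neighbour separates repeated labels such as the two h# runs, here is a minimal sketch on made-up data. The toy DataFrame and the names `blocks` and `runs` are assumptions for illustration only, not taken from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the question's 'phn' column.
df = pd.DataFrame({
    "Frame": range(8),
    "phn": ["h#", "h#", "h#", "sh", "ix", "ix", "h#", "h#"],
})

# ne(shift()) is True wherever the label differs from the previous row;
# the running sum then gives every consecutive run its own integer id.
g = df["phn"].ne(df["phn"].shift()).cumsum()

# One sub-DataFrame per run, in order of appearance.
blocks = [block for _, block in df.groupby(g)]

# Summarise each run as (label, first row index, last row index).
runs = [(b["phn"].iloc[0], int(b.index[0]), int(b.index[-1])) for b in blocks]
print(runs)
```

The first and last h# runs get different ids, so they end up in separate blocks. If a plain matrix is wanted rather than a DataFrame, calling `.to_numpy()` on a block should give one.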