Python DataFrame: get a list of the start and end of events

Time: 2015-11-12 11:58:33

Tags: python algorithm pandas

I have a DataFrame with a column of integer values (in my case 0 and 1). The index is time. I need a list of where the "areas" of ones start and end. I could do that with diff followed by a loop.

Example:

import pandas as pd
df = pd.DataFrame(index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
df['test'] = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]

methodOfLooking = ((2,4),(7,9)) # something like this should be the result

Any ideas of an efficient way to get the result?

1 answer:

Answer 0: (score: 2)

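You can use diff and zip to get the start and end indexes. Note that subtracting 1 from the end labels assumes an integer index that increases in steps of 1, and that the column begins and ends with 0: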

ix = df.test.diff().fillna(0)

In [74]: list(zip(df.index[ix==1], df.index[ix==-1]-1))  # list() is needed on Python 3, where zip is lazy
Out[74]: [(2, 4), (7, 9)]
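
Since the question says the real index is time, here is a minimal sketch of a variant that works on positions instead of index labels, so it does not rely on subtracting 1 from a label. It also pads the column with a zero on each side so that a run touching the first or last row is still reported. The function name find_runs is hypothetical, and using NumPy here is my own choice, not part of the answer above.

```python
import numpy as np
import pandas as pd

def find_runs(series):
    """Return (start_label, end_label) pairs for each run of nonzero values."""
    mask = series.to_numpy().astype(bool)
    # Pad with False on both sides so runs at the edges produce a +1/-1 step too.
    padded = np.concatenate(([False], mask, [False])).astype(np.int8)
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)      # position where a run begins
    ends = np.flatnonzero(change == -1) - 1   # position of the last 1 in the run
    # Map positions back to index labels (works for integer or time indexes).
    return list(zip(series.index[starts], series.index[ends]))

df = pd.DataFrame({'test': [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]},
                  index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(find_runs(df['test']))
```

For the example frame this should give the same pairs as the diff/zip one-liner. A run that consists of a single row is returned with identical start and end labels.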