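Pandas: find repeating patterns in a column and group them into cycles

Time: 2018-10-11 20:58:05

Tags: python pandas dataframe

I have a column whose values are 'loading', 'unloading', and 'nan'. I want to find each pattern of 'loading' followed by 'unloading', in that order, and label the corresponding rows as cycle1, cycle2, and so on.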


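The first screenshot shows one such sequence of 'loading' followed by 'unloading'. I want every row of that sequence to have the value 1 in a new column, the rows of the next 'loading' and 'unloading' sequence to have the value 2, and so on.

I don't have any logic of my own to show yet, but I would be grateful for any help. The second screenshot shows the output I expect.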


2 answers:

Answer 0: (score: 0)

Here is a loop-based approach. I would be excited to see a version that makes better use of pandas.

import pandas as pd

data = {'Event': ['Start','Going','Stop','Start','Stop','Start','Start','Going','Going','Going','Stop','Stop','Start','Stop']}


df = pd.DataFrame(data)

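cycle = 0          # current cycle number; stays 0 until the first 'Start'
new_cycle = True   # True while waiting for the 'Start' that opens the next cycle
cycles = []
for x in df.Event:
    if new_cycle and x == 'Start':
        # first 'Start' after a 'Stop' (or at the very beginning) opens a new cycle
        new_cycle = False
        cycle += 1
    elif x == 'Stop':
        # a 'Stop' closes the cycle; repeated 'Stop' rows keep the same number
        new_cycle = True
    cycles.append(cycle)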

df['cycles'] = cycles
print(df)

Output

    Event  cycles
0   Start       1
1   Going       1
2    Stop       1
3   Start       2
4    Stop       2
5   Start       3
6   Start       3
7   Going       3
8   Going       3
9   Going       3
10   Stop       3
11   Stop       3
12  Start       4
13   Stop       4
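
The loop above can also be expressed with vectorized pandas operations. The following is only a minimal sketch, assuming that 'Start' and 'Stop' are the only rows that matter and that every other value (such as 'Going', or NaN in the question's column) is ignored when looking back. The names `markers`, `prev_marker`, and `starts` are illustrative and do not come from the answer.

```python
import pandas as pd

data = {'Event': ['Start', 'Going', 'Stop', 'Start', 'Stop', 'Start', 'Start',
                  'Going', 'Going', 'Going', 'Stop', 'Stop', 'Start', 'Stop']}
df = pd.DataFrame(data)

# Keep only the rows that open or close a cycle; everything else becomes NaN.
markers = df['Event'].where(df['Event'].isin(['Start', 'Stop']))

# For each row, the most recent 'Start'/'Stop' seen strictly before it.
# Assumption: the beginning of the column behaves as if a 'Stop' preceded it.
prev_marker = markers.ffill().shift().fillna('Stop')

# A cycle begins at a 'Start' whose previous marker was a 'Stop'.
starts = (df['Event'] == 'Start') & (prev_marker == 'Stop')

# The running count of cycle starts is the cycle number of each row.
df['cycles'] = starts.cumsum()
print(df)
```

On this sample data the sketch gives the same `cycles` column as the loop. Rows that come before the first 'Start' would get 0 in both versions.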

Answer 1: (score: 0)

You can shift the DataFrame by one row and compare each event with the previous one, like this:

import pandas as pd

data = {'event': ['loading','loading','loading','unloading','unloading',
'loading','unloading','unloading','loading','loading','loading',
'loading','loading','loading']}
df = pd.DataFrame(data)

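# previous row's event for every row (NaN for the first row)
df_shifted = df[["event"]].shift()

# boolean Series: True where a 'loading' row directly follows an 'unloading' row
condition_results = (df["event"] == "loading") & (df_shifted["event"] == "unloading")

# cumsum() adds one at each True, so the counter starts at 0 and goes up at every new cycle
df["cycle"] = condition_results.cumsum()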

Results
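
With this data the condition is true only on a 'loading' row that directly follows an 'unloading' row, so the rows before the first such transition are numbered 0. As a small sketch, the same idea can be made 1-based and checked per cycle. The `+ 1` offset and the `groupby` check are additions for illustration, not part of the answer, and the offset assumes the column begins with 'loading'.

```python
import pandas as pd

data = {'event': ['loading', 'loading', 'loading', 'unloading', 'unloading',
                  'loading', 'unloading', 'unloading', 'loading', 'loading',
                  'loading', 'loading', 'loading', 'loading']}
df = pd.DataFrame(data)

# A new cycle begins where 'loading' directly follows 'unloading'.
starts = (df['event'] == 'loading') & (df['event'].shift() == 'unloading')

# cumsum() counts the transitions seen so far; + 1 makes the first cycle 1
# instead of 0 (an assumption about the numbering the question asks for).
df['cycle'] = starts.cumsum() + 1

# Number of rows in each cycle, as a quick sanity check.
sizes = df.groupby('cycle').size()

print(df)
print(sizes)
```

This variant only looks at the immediately preceding row, so a NaN row between 'unloading' and the next 'loading' would hide the transition; forward-filling the column before shifting is one way to handle that case.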