Question

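I have maintenance/trip data for a fleet of trucks. I need to identify the key maintenance issues and group the related records together. The logic provided by the client is as follows: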

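If there are fewer than 3 trips between one main job and the next,

and the keywords overlap (a job can have more than one keyword), then group them together.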
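The keyword-overlap condition can be checked with set intersection. The following is only a minimal sketch: `parse_keywords` and `keywords_overlap` are hypothetical helper names, and it assumes keywords are comma-separated with inconsistent spacing, as in the sample data.

```python
def parse_keywords(cell):
    """Split a comma-separated Key_Words cell into a set of clean keywords."""
    return {word.strip() for word in cell.split(",") if word.strip()}


def keywords_overlap(a, b):
    """Return True when two Key_Words cells share at least one keyword."""
    return bool(parse_keywords(a) & parse_keywords(b))


print(keywords_overlap("oil, leak", "leak, replace"))   # True
print(keywords_overlap("filter,replace,leak", "tire"))  # False
print(keywords_overlap("", "tire"))                     # False
```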

Here is a sample with the expected output (the "Group" column), which is based on the keywords, the truck ID, and the in/out status:

    import pandas as pd

    # Sample maintenance/trip log; 'Group' holds the expected output.
    data = {
        'Truck': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
        'Date': ['1/1/2019', '1/2/2019', '1/2/2019', '1/2/2019', '1/5/2019', '1/6/2019', '1/7/2019', '1/8/2019', '1/9/2019', '1/10/2019', '1/11/2019', '1/12/2019', '1/12/2019', '1/18/2019', '1/19/2019', '1/20/2019', '1/21/2019'],
        'Description': ['need oil change', 'flat tire pump', 'filter replaced', 'oil filled', 'truck on the road', 'flat tire replace', 'nail in tire', 'fix tire', 'truck on the road', 'truck on the road', 'tire replace', 'oil leak', 'replaced ring due to leak', 'truck on the road', 'truck on the road', 'truck on the road', 'replace filter due to leak'],
        'Key_Words': ['oil', 'tire', 'filter, replace', 'oil', '', 'tire', 'tire', 'tire', '', '', 'tire', 'oil, leak', 'leak, replace', '', '', '', 'filter,replace,leak'],
        'Type_of_Work': ['main', 'main', 'assist', 'assist', '', 'main', 'assist', 'assist', '', '', 'main', 'main', 'assist', '', '', '', 'main'],
        'In_Out': ['in', 'in', 'in', 'in', 'trip', 'in', 'in', 'in', 'trip', 'trip', 'in', 'in', 'in', 'trip', 'trip', 'trip', 'in'],
        'Group': ['1', '2', '3', '1', '', '2', '2', '2', '', '', '4', '5', '5', '', '', '', '6'],
    }
    df = pd.DataFrame(data)

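I tried assigning the "main" jobs and grouping on them, but sometimes an assist job is unrelated to the main job, and sometimes it belongs to a job that started earlier. I have no job code to track how the jobs relate to each other; only the keywords help. I know how to count the number of trips between two given main jobs, but I am not sure how to keep a running total. As long as fewer than 3 trips have been made since the last maintenance and the problem persists, I need to treat it as the same issue and the same maintenance.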
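One possible way to keep the running total is a plain loop that tracks, per truck, every group opened so far together with its accumulated keywords and a trip counter. This is a sketch under stated assumptions, not a definitive implementation: `assign_groups` and `MAX_TRIPS` are hypothetical names, rows are assumed to be in chronological order already, and trips are counted from the moment a group is opened. That last assumption is an interpretation of the sample, where the final tire job on truck A starts group 4 after three trips have passed since group 2 began.

```python
import pandas as pd

MAX_TRIPS = 3  # assumption: a group stops accepting jobs after this many trips


def assign_groups(df, max_trips=MAX_TRIPS):
    """Return one group label per row of df ('' for trip rows)."""
    labels = []
    next_id = 1
    open_groups = {}  # truck -> list of {"id", "keywords", "trips"}
    for truck, keywords, in_out in zip(df["Truck"], df["Key_Words"], df["In_Out"]):
        groups = open_groups.setdefault(truck, [])
        if in_out == "trip":
            # Running total: every group of this truck has seen one more trip.
            for group in groups:
                group["trips"] += 1
            labels.append("")
            continue
        words = {w.strip() for w in keywords.split(",") if w.strip()}
        # Earliest group of this truck that is still recent and shares a keyword.
        match = next(
            (g for g in groups if g["trips"] < max_trips and g["keywords"] & words),
            None,
        )
        if match is None:
            match = {"id": next_id, "keywords": set(), "trips": 0}
            groups.append(match)
            next_id += 1
        match["keywords"] |= words  # the group remembers every keyword it has seen
        labels.append(str(match["id"]))
    return labels


demo = pd.DataFrame({
    "Truck": ["B"] * 6,
    "Key_Words": ["oil, leak", "leak, replace", "", "", "", "filter,replace,leak"],
    "In_Out": ["in", "in", "trip", "trip", "trip", "in"],
})
demo["Group"] = assign_groups(demo)
print(demo["Group"].tolist())  # ['1', '1', '', '', '', '2']
```

Called as `assign_groups(df)` on the sample frame, this counting rule yields the same labels as the expected "Group" column. If trips should instead be counted from the most recent job in a group, the counter would need to be reset whenever a job joins the group, which gives a different result for the last tire job on truck A.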

How can I perform this kind of conditional grouping in a pandas DataFrame?

0 answers: