Question

I have below dataframe. Where START+TIME=END I want ti check id END of current row = START of next row then merge that 2 rows providing "ID" hsould the same

所以输出应该看起来像， So the output is as below

Answer 1

我不确定您的数据的格式如何正确，但是您可以替换。我建议您使用numpy并尝试一些方法：

public static String removeTillWord(String input, String word) {
    return input.substring(input.indexOf(word));
}

removeTillWord("Your String", "\");

Answer 2

样本DF

       Start    Time   End      ID
  0    43500    60     43560    23
  1    43560    60     43620    23 
  2    43620    1020   44640    24 
  3    44640    260    44900    24
  4    44900    2100   47000    24

代码：

a = df["ID"].tolist()
arr = [] 
t = True
for i in sorted(list(set(a))):
    j = 1
    k = 0
    temp = {}
    tempdf = df[df["ID"] == i]
    temp["Start"] = tempdf.iloc[k]["Start"]
    temp["Time"] = tempdf.iloc[k]["Time"]
    temp["End"] = tempdf.iloc[k]["End"]   
    temp["ID"] = tempdf.iloc[k]["ID"]

    while j < len(tempdf):
        if temp["End"] == tempdf.iloc[j]["Start"]:
            temp["End"] = tempdf.iloc[j]["End"]
            temp["Time"] += tempdf.iloc[j]["Time"] 
         j += 1  
        
    arr.append(temp)       

df = pd.DataFrame(arr)

输出DF：

        Start   Time    End     ID
   0    43500   120     43620   23   
   1    43620   3380    47000   24

熊猫比较下一行并根据条件合并

2 个答案: