Hello, I would like to use pandas to sum the minutes of rows that share the same values.
Input data:
Expected output:
Code:
def get_full_duration(df):
    for i in range(len(df)):
        # sort df
        df.sort_values(by=['zone', 'name'], inplace=True, axis=0)
        # I use shift to get the next rows
        df[['next_zone', 'next_name', 'next_minute']] = df[['zone', 'name', 'minute']].shift(-1)
        # check to see if the next line is the same
        df['same_case'] = (df['next_zone'] == df['zone']) & (df['next_name'] == df['name']) & (df['next_minute'] == df['minute'])
        # merging info for case when they are same
        df['next_minute'] = df.apply(lambda row: row['next_minute'] if row['same_case'] else row['minute'], axis=1)
        # Then to try and get a full duration:
        df['full time'] = (df['next_minute'] + df['minute'])
    return df
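This is what it returns:
It seems to take the last matching row and multiply its minutes by 2 instead of summing the minutes for each unique name. Any ideas or suggestions about what might be going wrong? Thanks.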
Answer 0 (score: 0)
df.groupby(['zone', 'name']).agg({'minute': 'sum'}).reset_index()
is what you need, assuming 'minute' is stored as integers.
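As a minimal sketch of the groupby approach: the question's input table is not shown here, so the sample rows below are made up, and the output column name full_time is a hypothetical choice. Only the column names zone, name and minute come from the question.

```python
import pandas as pd

# Hypothetical sample data; the original input table was not preserved.
df = pd.DataFrame({
    "zone": ["A", "A", "B", "A", "B"],
    "name": ["x", "x", "y", "z", "y"],
    "minute": [5, 10, 3, 7, 4],
})


def get_full_duration(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (zone, name) pair with the summed minutes."""
    return (
        df.groupby(["zone", "name"], as_index=False)["minute"]
        .sum()
        # 'full_time' is an illustrative column name, not from the question.
        .rename(columns={"minute": "full_time"})
    )


result = get_full_duration(df)
print(result)
```

Grouping sums every row in each group, whereas shift(-1) only ever compares a row with its immediate neighbour, which is probably why the original function ends up doubling a single value. If 'minute' is stored as strings, converting it first with pd.to_numeric(df['minute']) should let the sum behave numerically.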