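I have this data table, which shows the incoming calls at a grocery store.
Call_time is the time the call came in, activity_time is the time the employee started working in the software, activity_des is a description of the activity performed, Call_end is the time the call ended, and activity_duration is the duration of each activity.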
Date | Call_time | activity_time | activity_des| Call_end | activity_duration
-----------------------------------------------------------------------------------
2017-05-03 | 08:05:53 | 08:06:03 | Online shop | 08:07:03 | 30
2017-05-03 | 08:07:30 | 08:08:00 | Transfer | 08:10:00 | 25
2017-05-03 | 08:07:30 | 08:08:25 | buy | 08:10:00 | 35
2017-05-03 | 08:07:30 | 08:09:00 | receipt | 08:10:00 | 60
2017-05-04 | 14:34:10 | 14:40:00 | question | 14:41:47 | 66
2017-05-04 | 14:34:10 | 14:41:06 | question | 14:41:47 | 39
...... | ..... | ..... | ..... | ..... | ..
Desired output:
Date | Call_time | activity_des | Call_end | activities_duration
---------------------------------------------------------------------------
2017-05-03 | 08:05:53 | Online shop | 08:07:03 | 30
2017-05-03 | 08:07:30 | Transfer,buy,receipt | 08:10:00 | 120
2017-05-04 | 14:34:10 | question | 14:41:47 | 105
...... | ..... | ..... | ..... | ..
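So, since activity_time is not needed, it should be dropped; the different activity_des values belonging to the same call should be merged into one field, and the activity_duration values of the merged rows should be summed into a single value.
Also, if the same activity occurs twice in a row (as with question), I don't need it listed twice after merging; only the durations should be added up.
Thanks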
Answer 0 (score: 2)
Using tidyverse:
library(tidyverse)
activity %>%
select(-activity_time) %>%
group_by(Date, Call_time,Call_end) %>%
summarize(activity_des = paste(activity_des,collapse=", "),
activity_duration = sum(activity_duration))
# # A tibble: 3 x 5
# # Groups: Date, Call_time [?]
# Date Call_time Call_end activity_des activity_duration
# <chr> <chr> <chr> <chr> <dbl>
# 1 2017-05-03 08:05:53 08:07:03 Online shop 30
# 2 2017-05-03 08:07:30 08:10:00 Transfer, buy, receipt 120
# 3 2017-05-04 14:34:10 14:41:47 question, question 105
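Note that the output above still lists question twice for the last call, while the question asks for consecutive repeats to appear only once. One possible way to handle that, offered as a sketch rather than as part of the original answer, is to pass activity_des through base R's rle() before pasting, so each run of identical consecutive values is kept once. The inline data frame and the name result are only there to make the example self-contained; the separator "," is an assumption taken from the desired output.

```r
library(dplyr)

# Same sample rows as in the question, built inline so the sketch runs on its own
activity <- data.frame(
  Date = c(rep("2017-05-03", 4), rep("2017-05-04", 2)),
  Call_time = c("08:05:53", rep("08:07:30", 3), rep("14:34:10", 2)),
  activity_time = c("08:06:03", "08:08:00", "08:08:25",
                    "08:09:00", "14:40:00", "14:41:06"),
  activity_des = c("Online shop", "Transfer", "buy",
                   "receipt", "question", "question"),
  Call_end = c("08:07:03", rep("08:10:00", 3), rep("14:41:47", 2)),
  activity_duration = c(30, 25, 35, 60, 66, 39),
  stringsAsFactors = FALSE
)

result <- activity %>%
  select(-activity_time) %>%
  group_by(Date, Call_time, Call_end) %>%
  summarize(
    # rle() keeps one value per run of identical consecutive entries
    activity_des = paste(rle(activity_des)$values, collapse = ","),
    # durations are still summed over every row of the call
    activity_duration = sum(activity_duration)
  ) %>%
  ungroup()

print(result)
```

rle() only collapses repeats that are adjacent, so a call with question, buy, question would keep both question entries. If an activity should appear at most once per call regardless of order, unique(activity_des) could be used in its place.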
Data
activity <- read.table(header=TRUE,stringsAsFactors=FALSE,sep="|",text="
Date | Call_time | activity_time | activity_des| Call_end | activity_duration
2017-05-03 | 08:05:53 | 08:06:03 | Online shop | 08:07:03 | 30
2017-05-03 | 08:07:30 | 08:08:00 | Transfer | 08:10:00 | 25
2017-05-03 | 08:07:30 | 08:08:25 | buy | 08:10:00 | 35
2017-05-03 | 08:07:30 | 08:09:00 | receipt | 08:10:00 | 60
2017-05-04 | 14:34:10 | 14:40:00 | question | 14:41:47 | 66
2017-05-04 | 14:34:10 | 14:41:06 | question | 14:41:47 | 39")
activity[] <- lapply(activity,trimws)
activity$activity_duration <- as.numeric(activity$activity_duration)