Convert columns in a table into merged rows in R and sum the integers

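Asked: 2018-06-22 13:39:22

Tags: r rstudio

I have this data table, which shows incoming calls at a grocery store. Call_time is the time the call came in, activity_time is the time the employee started using the software, activity_des is a description of the activity performed, Call_end is the time the call finished, and activity_duration is the duration of each activity.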

            Date | Call_time | activity_time | activity_des| Call_end | activity_duration
      -----------------------------------------------------------------------------------
      2017-05-03 | 08:05:53  |   08:06:03    | Online shop | 08:07:03 | 30 
      2017-05-03 | 08:07:30  |   08:08:00    | Transfer    | 08:10:00 | 25
      2017-05-03 | 08:07:30  |   08:08:25    | buy         | 08:10:00 | 35
      2017-05-03 | 08:07:30  |   08:09:00    | receipt     | 08:10:00 | 60
      2017-05-04 | 14:34:10  |   14:40:00    | question    | 14:41:47 | 66 
      2017-05-04 | 14:34:10  |   14:41:06    | question    | 14:41:47 | 39
      ......     |  .....    |     .....     |   .....     |   .....  | ..

Desired output

            Date | Call_time |      activity_des      | Call_end    | activities_duration
      ---------------------------------------------------------------------------
      2017-05-03 | 08:05:53  |   Online shop          | 08:07:03    | 30 
      2017-05-03 | 08:07:30  |   Transfer,buy,receipt | 08:10:00    | 120
      2017-05-04 | 14:34:10  |   question             | 14:41:47    | 105
      ......     |  .....    |          .....         |   .....     | ..

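So activity_time is dropped because it is not needed, the different activity_des values within the same call are merged together, and the activity_duration values of those merged rows are summed into a single value. Also, if the same activity occurs twice in a row (as with question), I don't need it shown twice after merging; just add up the durations. Thanks.

1 Answer:

Answer 0 (score: 2)

Using tidyverse: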

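library(tidyverse)

activity %>%
  select(-activity_time) %>%                 # activity_time is not needed
  group_by(Date, Call_time, Call_end) %>%    # one group per call
  summarize(activity_des = paste(activity_des, collapse = ", "),  # join the descriptions
            activity_duration = sum(activity_duration))           # total duration per call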
# # A tibble: 3 x 5
# # Groups:   Date, Call_time [?]
#         Date Call_time Call_end           activity_des activity_duration
#        <chr>     <chr>    <chr>                  <chr>             <dbl>
# 1 2017-05-03  08:05:53 08:07:03            Online shop                30
# 2 2017-05-03  08:07:30 08:10:00 Transfer, buy, receipt               120
# 3 2017-05-04  14:34:10 14:41:47     question, question               105
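
The output above still lists `question` twice for the last call, whereas the question asks for an activity that repeats back to back to be shown only once. A minimal sketch of one way to handle that, assuming base R's `rle()` (run-length encoding) is acceptable for collapsing consecutive repeats; the `result` name and the hand-built `activity` data frame are only for illustration:

```r
library(dplyr)

# Same sample rows as in the question, typed in directly (illustrative copy)
activity <- data.frame(
  Date = c("2017-05-03", "2017-05-03", "2017-05-03",
           "2017-05-03", "2017-05-04", "2017-05-04"),
  Call_time = c("08:05:53", "08:07:30", "08:07:30",
                "08:07:30", "14:34:10", "14:34:10"),
  activity_time = c("08:06:03", "08:08:00", "08:08:25",
                    "08:09:00", "14:40:00", "14:41:06"),
  activity_des = c("Online shop", "Transfer", "buy",
                   "receipt", "question", "question"),
  Call_end = c("08:07:03", "08:10:00", "08:10:00",
               "08:10:00", "14:41:47", "14:41:47"),
  activity_duration = c(30, 25, 35, 60, 66, 39),
  stringsAsFactors = FALSE
)

result <- activity %>%
  select(-activity_time) %>%
  group_by(Date, Call_time, Call_end) %>%
  summarize(
    # rle() keeps one value per run of identical consecutive entries,
    # so "question", "question" collapses to a single "question"
    activity_des = paste(rle(activity_des)$values, collapse = ", "),
    activity_duration = sum(activity_duration)
  ) %>%
  ungroup()

print(result)
```

`rle()` only merges repeats that are adjacent within a call. If an activity should appear once per call regardless of order, `unique(activity_des)` could be used in its place.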

Data

activity <- read.table(header=TRUE,stringsAsFactors=FALSE,sep="|",text="
Date | Call_time | activity_time | activity_des| Call_end | activity_duration
  2017-05-03 | 08:05:53  |   08:06:03    | Online shop | 08:07:03 | 30 
2017-05-03 | 08:07:30  |   08:08:00    | Transfer    | 08:10:00 | 25
2017-05-03 | 08:07:30  |   08:08:25    | buy         | 08:10:00 | 35
2017-05-03 | 08:07:30  |   08:09:00    | receipt     | 08:10:00 | 60
2017-05-04 | 14:34:10  |   14:40:00    | question    | 14:41:47 | 66 
2017-05-04 | 14:34:10  |   14:41:06    | question    | 14:41:47 | 39")

activity[] <- lapply(activity,trimws)
activity$activity_duration <- as.numeric(activity$activity_duration)
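
The two clean-up lines above are needed because the fields are padded with spaces around the `|` separator. As a possible alternative, and only a sketch, `read.table()` accepts `strip.white = TRUE`, which trims unquoted character fields while reading when `sep` is given, so the duration column should come back numeric without a separate conversion:

```r
# Read the pipe-separated sample and trim padding during the read itself
activity <- read.table(header = TRUE, stringsAsFactors = FALSE, sep = "|",
                       strip.white = TRUE, text = "
Date | Call_time | activity_time | activity_des| Call_end | activity_duration
2017-05-03 | 08:05:53  |   08:06:03    | Online shop | 08:07:03 | 30
2017-05-03 | 08:07:30  |   08:08:00    | Transfer    | 08:10:00 | 25
2017-05-03 | 08:07:30  |   08:08:25    | buy         | 08:10:00 | 35
2017-05-03 | 08:07:30  |   08:09:00    | receipt     | 08:10:00 | 60
2017-05-04 | 14:34:10  |   14:40:00    | question    | 14:41:47 | 66
2017-05-04 | 14:34:10  |   14:41:06    | question    | 14:41:47 | 39")

# Inspect the column types to confirm nothing needs further cleaning
str(activity)
```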