How can I best summarise the data frame below in R?
Driver_ID <- c('AB1','AB1')
Date_today<- as.Date(c('2018-10-24','2018-10-24'))
Motor_Vehicle_Brand <- c('Toyota','VW')
Type_of_vehicle <- c('Corrola','Golf 5')
Country <- c('USA','USA')
Speed <- as.numeric(c('300','400'))
Number_of_brands_drived <- as.numeric(c('1','1'))
car.data <- data.frame(Driver_ID, Date_today, Motor_Vehicle_Brand, Type_of_vehicle, Country, Speed, Number_of_brands_drived)
so that it shows the following:
Driver_ID <- 'AB1'
Date_today<- as.Date('2018-10-24')
Motor_Vehicle_Brand <- c('Toyota VW')
Type_of_vehicle <- 'Corrola Golf 5'
Country <- 'USA'
Speed <- as.numeric('700')
Number_of_brands_drived <- as.numeric('2')
car.data <- data.frame(Driver_ID, Date_today, Motor_Vehicle_Brand, Type_of_vehicle, Country, Speed, Number_of_brands_drived)
I tried the following code, but it did not group the data the way I wanted:
car.data %>%
  group_by(Driver_ID, Country) %>%
  mutate(Highest_speed = sum(Speed),
         Number_of_brands_driven = sum(Number_of_brands_drived))
However, this returns data that still has one row per record, as if it had not been summarised at all.
Please assist.
Answer 0 (score: 0)
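As I wrote in the comments, you need to use summarise instead of mutate. mutate keeps every row and only adds the group totals as new columns, whereas summarise collapses each group to a single row.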
library(dplyr)
car.data %>%
  group_by(Driver_ID, Country) %>%
  summarise(Highest_speed = sum(Speed),
            Number_of_brands_driven = sum(Number_of_brands_drived))
# A tibble: 1 x 4
# Groups:   Driver_ID [?]
  Driver_ID Country Highest_speed Number_of_brands_driven
  <fct>     <fct>           <dbl>                   <dbl>
1 AB1       USA               700                       2
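The difference between the two verbs can be checked directly by counting rows. The following is a minimal sketch, assuming a cut-down copy of the question's data with only the columns needed here. The names with_mutate and with_summarise are introduced purely for illustration.

```r
library(dplyr)

# Cut-down copy of the example data from the question (assumption: only
# the columns needed for this comparison are included)
car.data <- data.frame(
  Driver_ID = c("AB1", "AB1"),
  Country = c("USA", "USA"),
  Speed = c(300, 400),
  Number_of_brands_drived = c(1, 1)
)

# mutate() writes the group total onto every row, so both rows are kept
with_mutate <- car.data %>%
  group_by(Driver_ID, Country) %>%
  mutate(Highest_speed = sum(Speed))

# summarise() collapses each group to a single row
with_summarise <- car.data %>%
  group_by(Driver_ID, Country) %>%
  summarise(Highest_speed = sum(Speed))

nrow(with_mutate)     # 2
nrow(with_summarise)  # 1
```

The mutate version still has two rows, each carrying the same group total of 700, which is why the result looked ungrouped.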
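Edit: adding Motor_Vehicle_Brand to the summary.
To include the brands in the summary without creating duplicate records, paste the values of each group together into a single string.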
car.data %>%
  group_by(Driver_ID, Country) %>%
  summarise(Highest_speed = sum(Speed),
            Number_of_brands_driven = sum(Number_of_brands_drived),
            brands = paste(Motor_Vehicle_Brand, collapse = ", "))
# A tibble: 1 x 5
# Groups:   Driver_ID [?]
  Driver_ID Country Highest_speed Number_of_brands_driven brands
  <fct>     <fct>           <dbl>                   <dbl> <chr>
1 AB1       USA               700                       2 Toyota, VW
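If a driver could have several rows for the same brand, plain paste would repeat that brand, and summing a per-row counter would count it more than once. A possible variation is sketched below. It assumes the goal is the number of distinct brands, which the question does not state explicitly, and that dplyr 1.0.0 or later is installed for the .groups argument. The name brand_summary is hypothetical.

```r
library(dplyr)

# Cut-down copy of the example data from the question
car.data <- data.frame(
  Driver_ID = c("AB1", "AB1"),
  Motor_Vehicle_Brand = c("Toyota", "VW"),
  Country = c("USA", "USA"),
  Speed = c(300, 400)
)

brand_summary <- car.data %>%
  group_by(Driver_ID, Country) %>%
  summarise(
    Highest_speed = sum(Speed),
    # Count distinct brands instead of summing a per-row counter
    # (assumption: "number of brands" means distinct brands)
    Number_of_brands_driven = n_distinct(Motor_Vehicle_Brand),
    # unique() avoids listing the same brand twice
    brands = paste(unique(Motor_Vehicle_Brand), collapse = ", "),
    # Return an ungrouped tibble (requires dplyr >= 1.0.0)
    .groups = "drop"
  )

brand_summary
```

For the two-row example this gives the same single row as the answer above. The results would differ only when a brand is repeated within a group.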