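For the following data frame, I want to use `mutate` to create new columns based on the values in the "Type" column and count how many times each value occurs. The data should be grouped by "Group" and "Choice".
Over time, the "Type" column will gain new values that are not listed yet, so the code should be flexible in that respect.
Is this possible with the dplyr library?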
library(dplyr)
df <- data.frame(Group = c("A","A","A","B","B","C","C","D","D","D","D","D"),
Choice = c("Yes","Yes","No","No","Yes","Yes","Yes","Yes","No","No","No","No"),
Type = c("Fruit","Construction","Fruit","Planes","Fruit","Trips","Construction","Cars","Trips","Fruit","Planes","Trips"))
The desired result should look like this:
result <- data.frame(Group = c("A","A","B","B","C","D","D"),
Choice = c("Yes","No","Yes","No","Yes","Yes","No"),
Fruit = c(1,1,1,0,0,0,1),
Construction = c(1,0,0,0,1,0,0),
Planes = c(0,0,0,1,0,0,1),
Trips = c(0,0,0,0,1,0,2),
Cars = c(0,0,0,0,0,1,0))
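Answer 0 (score: 1)
We can count first and then `spread` the counts into one column per "Type" value. Because the new columns are taken from whatever values appear in "Type", values added later are picked up automatically.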
library(tidyverse)
df %>%
count(Group, Choice, Type) %>%
spread(Type, n, fill = 0)
# A tibble: 7 x 7
#   Group Choice  Cars Construction Fruit Planes Trips
#   <fct> <fct>  <dbl>        <dbl> <dbl>  <dbl> <dbl>
# 1 A     No         0            0     1      0     0
# 2 A     Yes        0            1     1      0     0
# 3 B     No         0            0     0      1     0
# 4 B     Yes        0            0     1      0     0
# 5 C     Yes        0            1     0      0     1
# 6 D     No         0            0     1      1     2
# 7 D     Yes        1            0     0      0     0
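As a side note, `spread()` has been superseded by `pivot_wider()` since tidyr 1.0.0. The same count-then-reshape idea could be sketched as follows, assuming a reasonably recent tidyr (1.1.0 or later, where `values_fill` accepts a single value). The name `wide` is only used here for illustration.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  Group  = c("A","A","A","B","B","C","C","D","D","D","D","D"),
  Choice = c("Yes","Yes","No","No","Yes","Yes","Yes","Yes","No","No","No","No"),
  Type   = c("Fruit","Construction","Fruit","Planes","Fruit","Trips",
             "Construction","Cars","Trips","Fruit","Planes","Trips")
)

wide <- df %>%
  # one row per Group/Choice/Type combination, with its count in column n
  count(Group, Choice, Type) %>%
  # one column per distinct Type value; combinations that never occur become 0
  pivot_wider(names_from = Type, values_from = n, values_fill = 0)

wide
```

Like the `spread()` version, this does not hard-code any "Type" values, so it should pick up new ones without changes. The column order may differ from the output shown above.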