Converting a long list to a binary data frame when it contains duplicates

Time: 2018-05-21 06:05:13

Tags: r dplyr

According to this question and its answer, a long list can be converted to a binary data frame.

But how can this be done for a data frame that contains the same value several times for the same user?

Example data frame:

d_long <- data.frame( nameid = c("sally","sally","sally", "sally","Robert","annie","annie","annie"), value = c("product1","ra","ent","ra","ra","ra","product1","product1"))
  nameid    value
1  sally product1
2  sally       ra
3  sally      ent
4  sally       ra
5 Robert       ra
6  annie       ra
7  annie product1
8  annie product1

The expected output is:

d_exist <- data.frame(nameid = c("sally","Robert","annie"), product1 = c(1,0,1), ra = c(1,1,1), ent = c(1,0,0))
  nameid product1 ra ent
1  sally        1  1   1
2 Robert        0  1   0
3  annie        1  1   0

But when I try this:

d_long %>% group_by(nameid, value) %>%
     mutate(count = n()) %>%
     ungroup() %>%
     spread(value, count, fill = 0) %>%
     as.data.frame()

I get the error message:

Error: Duplicate identifiers for rows (7, 8), (2, 4)

Is it correct to just use d_long[!duplicated(d_long), ]?
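One way to check this idea: the error comes from repeated nameid/value pairs, which mutate() keeps as separate rows, so dropping exact duplicate rows before spread() should avoid it. A minimal sketch, assuming the dplyr and tidyr packages are installed (the name d_wide and the indicator column n are only illustrative):

```r
library(dplyr)
library(tidyr)

d_long <- data.frame(
  nameid = c("sally", "sally", "sally", "sally", "Robert", "annie", "annie", "annie"),
  value  = c("product1", "ra", "ent", "ra", "ra", "ra", "product1", "product1")
)

# Drop exact duplicate rows with base R, so each nameid/value pair occurs once
d_unique <- d_long[!duplicated(d_long), ]

# Add a constant indicator column and spread it; missing pairs are filled with 0
d_wide <- d_unique %>%
  mutate(n = 1) %>%
  spread(value, n, fill = 0)

d_wide
```

Because every remaining nameid/value pair is unique, spread() no longer finds duplicate identifiers.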

1 Answer:

Answer 0: (score: 1)

We can use distinct to drop the duplicated rows and then do the spread

library(tidyverse)
d_long %>%
  distinct %>% 
  mutate(n = 1) %>% 
  spread(value, n, fill = 0)
#    nameid ent product1 ra
#1  annie   0        1  1
#2 Robert   0        0  1
#3  sally   1        1  1
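In later tidyr releases spread() has been superseded by pivot_wider(). The same idea could be sketched as follows, assuming tidyr 1.1.0 or newer (where values_fill accepts a single value); the name d_wide is only illustrative:

```r
library(dplyr)
library(tidyr)

d_long <- data.frame(
  nameid = c("sally", "sally", "sally", "sally", "Robert", "annie", "annie", "annie"),
  value  = c("product1", "ra", "ent", "ra", "ra", "ra", "product1", "product1")
)

d_wide <- d_long %>%
  distinct() %>%                      # keep one row per nameid/value pair
  mutate(n = 1) %>%                   # indicator: this pair exists
  pivot_wider(names_from = value,     # one column per distinct value
              values_from = n,
              values_fill = 0) %>%    # absent pairs become 0
  as.data.frame()

d_wide
```

As with spread(), the distinct() step is what prevents several rows from mapping to the same cell.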