pivot_wider based on a 0/1 condition

Asked: 2019-11-28 11:57:29

Tags: r tidyr

I am trying to use pivot_wider on my data. The data looks like this:

       dates yes_no
1 2017-01-01      0
2 2017-01-02      1
3 2017-01-03      0
4 2017-01-04      1
5 2017-01-05      1

This is the output I am trying to obtain:

       dates yes_no 2017-01-02_1 2017-01-04_1 2017-01-05_1
1 2017-01-01      0            0            0            0
2 2017-01-02      1            1            0            0
3 2017-01-03      0            0            0            0
4 2017-01-04      1            0            1            0
5 2017-01-05      1            0            0            1

That is, the data should be spread into a new column only for the dates where the yes_no column is 1.

This does not work for me:

d %>% 
  mutate(value_for_one_hot = 1) %>%
  pivot_wider(names_from = dates, values_from = value_for_one_hot,
            names_prefix = "date_", values_fill = list(value_for_one_hot = 0)) 

Data:

d <- data.frame(
  dates = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
  yes_no = c(0, 1, 0, 1, 1) 
)

2 answers:

Answer 0 (score: 1)

Create a duplicate of the yes_no column, create a new column that holds the column names, and then do the usual spread/pivot_wider.

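library(dplyr)
library(tidyr)

d %>%
  # Duplicate yes_no as the value column; build a column name only where yes_no is 1
  mutate(yes_no_dup = yes_no,
         cols = if_else(yes_no == 1, paste0(dates, "_1"), NA_character_)) %>%
  spread(cols, yes_no_dup, fill = 0) %>%
  # Rows with yes_no == 0 fall under the <NA> key column; drop it
  select(-`<NA>`)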
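
Since the question asks about pivot_wider specifically, the same idea can be sketched with it instead of spread. This is only one possible variant, not the answerer's code: it pivots every date into a column and then keeps only the columns for dates where yes_no is 1, which avoids having an NA key at all. The names `name`, `value`, `keep` and `res` are illustrative, and the named-list form of `values_fill` is used on the assumption that it should be accepted by tidyr 1.0.0 and later.

```r
library(dplyr)
library(tidyr)

d <- data.frame(
  dates = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
  yes_no = c(0, 1, 0, 1, 1)
)

# Names of the indicator columns we want to keep (dates where yes_no is 1)
keep <- paste0(d$dates[d$yes_no == 1], "_1")

res <- d %>%
  # One candidate column per date; the cell value is the row's own yes_no
  mutate(name = paste0(dates, "_1"), value = yes_no) %>%
  pivot_wider(names_from = name, values_from = value,
              values_fill = list(value = 0)) %>%
  # Drop the all-zero columns that belong to dates with yes_no == 0
  select(dates, yes_no, all_of(keep))

res
```

The columns for 2017-01-01 and 2017-01-03 are created and then discarded, so for a long series with few 1s it may be cheaper to build only the needed columns.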

Answer 1 (score: 1)

Here is an approach that does not actually reshape the data.

library(data.table)
setDT(d)

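# Rows where yes_no is nonzero, and the dates on those rows (used as new column names)
ind <- d[['yes_no']] != 0
cols <- as.character(d[['dates']])[ind]

# Add the new columns filled with 0, then write an identity matrix into the flagged rows
d[, (cols) := 0L]
d[ind, (cols) := as.data.frame(diag(.N))]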

## also valid
# set(d, which(ind), cols, as.data.frame(diag(length(cols))))

d
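
The reason `diag()` works here can be seen in a small base R sketch of the same trick: each wanted column is simply one column of the identity matrix. This is an illustrative variant rather than the answerer's code; the `_1` suffix is added only to match the expected output in the question, and `m` and `res` are made-up names.

```r
d <- data.frame(
  dates = c("2017-01-01", "2017-01-02", "2017-01-03", "2017-01-04", "2017-01-05"),
  yes_no = c(0, 1, 0, 1, 1)
)

# Rows where yes_no is 1
ind <- d$yes_no == 1

# Identity matrix with one row per observation; keep only the columns
# that correspond to the flagged rows
m <- diag(nrow(d))[, ind, drop = FALSE]
colnames(m) <- paste0(d$dates[ind], "_1")

# cbind on a data frame keeps the non-syntactic column names as they are
res <- cbind(d, m)
res
```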