I have a data frame like the one below, and I want to one-hot encode it by week:
id week
345 1
351 2
222 4
264 3
345 5
277 2
345 2
345 2
264 5
...
This is my desired output (each cell counts how many times an id appears in that week):
id week1 week2 week3 week4 week5
345 1 2 0 0 1
351 0 1 0 0 0
222 0 0 0 1 0
264 0 0 1 0 1
277 0 1 0 0 0
...
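My idea was to build this by combining one-hot encodings of the data frame, but that gets very complicated.
Does anyone know how I can get this output in R?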
Answer 0 (score: 0)
I'm sure this can be done more elegantly, but this gets the job done.
# Libraries
library(dplyr)
library(tidyr)
# Dataframe
data <- "id week
345 1
351 2
222 4
264 3
345 5
277 2
345 2
345 2
264 5"
df <- read.table(text = data, header = TRUE)
# All at once
df <- df %>%
  group_by(id, week) %>%
  summarise(count = n()) %>%
  mutate(week = paste0("week", week)) %>%
  spread(week, count)
# Setting NA to zero
df[is.na(df)] <- 0
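On newer tidyr versions, the same reshaping can be sketched with count() and pivot_wider(), which supersede the summarise()/spread() pair. This is an alternative and not the method used above. It assumes tidyr 1.1.0 or later (for names_sort and a scalar values_fill), and the name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df <- read.table(text = "id week
345 1
351 2
222 4
264 3
345 5
277 2
345 2
345 2
264 5", header = TRUE)

wide <- df %>%
  count(id, week) %>%              # one row per (id, week) pair with its count n
  pivot_wider(
    names_from   = week,           # one column per week value
    values_from  = n,
    names_prefix = "week",         # week1, week2, ...
    names_sort   = TRUE,           # assumption: tidyr >= 1.1.0
    values_fill  = 0               # missing (id, week) pairs become 0
  )

print(wide)
```

Without names_sort = TRUE the week columns come out in order of first appearance, so they would not be sorted for this data. A week that never occurs in the input gets no column with either approach.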
Answer 1 (score: 0)
Using tidyverse:
df %>%
  mutate(week = paste("week", week, sep = "")) %>%
  group_by(id, week) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  spread(key = week, value = n) %>%
  # Replace NA counts with 0 in every column
  mutate_all(~ replace(., is.na(.), 0))
# A tibble: 5 x 6
id week1 week2 week3 week4 week5
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 222. 0. 0. 0. 1. 0.
2 264. 0. 0. 1. 0. 1.
3 277. 0. 1. 0. 0. 0.
4 345. 1. 2. 0. 0. 1.
5 351. 0. 1. 0. 0. 0.
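As a cross-check that needs no packages, base R's table() should give the same counts. This is a minimal sketch using the sample data from the question; the names `counts` and `wide` are illustrative and do not come from either answer.

```r
# Same sample data as in the question
df <- read.table(text = "id week
345 1
351 2
222 4
264 3
345 5
277 2
345 2
345 2
264 5", header = TRUE)

# Cross-tabulate: rows are ids, columns are weeks, cells are counts
counts <- table(df$id, df$week)

# Turn the contingency table into a wide data frame
wide <- as.data.frame.matrix(counts)
names(wide) <- paste0("week", names(wide))

# Move the ids from row names into a regular column
wide <- cbind(id = as.integer(rownames(wide)), wide)
rownames(wide) <- NULL

print(wide)
```

Because table() fills absent combinations with 0, no separate NA replacement step is needed here.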