I have a dataset that holds a count of records for each person:
set.seed(99)
# Create values from a Poisson distribution
freqs <- rpois(100, 3)
# Add an ID to each row
freqs <- as.data.frame(freqs)
freqs$id <- seq_len(nrow(freqs))
I now want the value in freqs$freqs to be the number of observations (rows) for each ID. The data currently looks like this:
ID freqs
1 3
2 1
... ...
3 2
The end result should be:
ID freqs
1 3
1 3
1 3
2 1
... ....
3 2
3 2
Answer 0 (score: 2)
One option is uncount from tidyr:
library(tidyr)
library(dplyr)
# Repeat each row freqs times; .remove = FALSE keeps the freqs column
uncount(freqs, freqs, .remove = FALSE) %>%
  as_tibble() %>%
  select(id, freqs)
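To make the behaviour of uncount concrete, here is a minimal sketch on a three-row toy data frame. The name toy and its values are made up for illustration and mirror the small table in the question rather than the simulated data. Each row is repeated as many times as its freqs value.

```r
library(tidyr)

# Hypothetical toy data: id 1 should appear 3 times, id 2 once, id 3 twice
toy <- data.frame(id = 1:3, freqs = c(3, 1, 2))

# Repeat each row freqs times; .remove = FALSE keeps the weights column
expanded <- uncount(toy, freqs, .remove = FALSE)

# One row per observation, so the row count equals the sum of the counts
nrow(expanded) == sum(toy$freqs)
```

Note that a row whose count is 0 is repeated zero times, so that ID does not appear in the expanded data at all. With Poisson-distributed counts this is likely to affect a few IDs.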
Answer 1 (score: 2)
Another tidyverse option to get the IDs:
# Repeat each id freqs times, then bind the pieces into one data frame
plyr::ldply(purrr::map2(freqs$id, freqs$freqs, function(x, y) rep(x, y)),
            data.frame)
Answer 2 (score: 2)
# Repeat every column by the per-row counts in freqs$freqs
as.data.frame(lapply(freqs, rep, freqs$freqs))
# freqs id
# 1 3 1
# 2 3 1
# 3 3 1
# 4 1 2
# 5 4 3
# 6 4 3
# 7 4 3
# 8 4 3
# 9 8 4
# 10 8 4
# 11 8 4
# 12 8 4
# 13 8 4
# 14 8 4
# 15 8 4
# 16 8 4
# ...
or
purrr::map_dfr(freqs, rep, freqs$freqs)
# # A tibble: 293 x 2
# freqs id
# <int> <int>
# 1 3 1
# 2 3 1
# 3 3 1
# 4 1 2
# 5 4 3
# 6 4 3
# 7 4 3
# 8 4 3
# 9 8 4
# 10 8 4
# # ... with 283 more rows
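As a sanity check on any of these results, the expanded data should contain exactly sum(freqs$freqs) rows, and each ID should occur freqs times. The following is a minimal base R sketch, assuming the freqs data frame built in the question; the names expanded, counts and positive are illustrative. IDs with a count of 0 drop out of the expanded data, so the per-ID comparison is restricted to positive counts.

```r
# Rebuild the data from the question
set.seed(99)
freqs <- data.frame(freqs = rpois(100, 3))
freqs$id <- seq_len(nrow(freqs))

# Repeat every column by the per-row counts
expanded <- as.data.frame(lapply(freqs, rep, freqs$freqs))

# Total rows should equal the sum of the counts
stopifnot(nrow(expanded) == sum(freqs$freqs))

# Each surviving id should occur exactly freqs times
counts <- table(expanded$id)
positive <- freqs[freqs$freqs > 0, ]
stopifnot(all(counts[as.character(positive$id)] == positive$freqs))
```

If both stopifnot calls pass silently, the expansion is consistent with the original counts.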