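I don't understand why combining tidyr's nesting() with complete() gives me the wrong result. I have read the help page and tried its examples, but I still can't produce what I want in the tidyverse.
Here is my data: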
##   project_id meta  domain1 question_count tag_count
##        <int> <chr> <chr>            <int>     <int>
## 1          1 A     d                    3         2
## 2          1 A     e                    3         1
## 3          1 B     h                    3         3
## 4          1 B     i                    3         2
## 5          2 A     d                    2         1
## 6          2 B     i                    2         1
## 7          2 C     k                    2         2
This is the output I want, with every meta/domain1 combination present for each project:
##    project_id meta  domain1 question_count tag_count
##         <int> <chr> <chr>            <int>     <int>
##  1          1 A     d                    3         2
##  2          1 A     e                    3         1
##  3          1 B     h                    3         3
##  4          1 B     i                    3         2
##  5          1 C     k                    3         0
##  6          2 A     d                    2         1
##  7          2 A     e                    2         0
##  8          2 B     h                    2         1
##  9          2 B     i                    2         1
## 10          2 C     k                    2         2
## MWE
library(dplyr); library(tidyr)
# tibble() replaces the deprecated data_frame()
dat <- tibble(
  project_id = as.integer(c(rep(1, 4), rep(2, 3))),
  meta = c('A', 'A', 'B', 'B', 'A', 'B', 'C'),
  domain1 = c('d', 'e', 'h', 'i', 'd', 'i', 'k'),
  question_count = as.integer(c(rep(3, 4), rep(2, 3))),
  tag_count = as.integer(c(2, 1, 3, 2, 1, 1, 2))
)
## Failed Attempts
dat %>%
  complete(
    project_id,
    tidyr::nesting(question_count, meta, domain1),
    fill = list(tag_count = 0)
  )
## # A tibble: 14 x 5
##    project_id question_count meta  domain1 tag_count
##         <int>          <int> <chr> <chr>       <dbl>
##  1          1              2 A     d            0
##  2          1              2 B     i            0
##  3          1              2 C     k            0
##  4          1              3 A     d            2.00
##  5          1              3 A     e            1.00
##  6          1              3 B     h            3.00
##  7          1              3 B     i            2.00
##  8          2              2 A     d            1.00
##  9          2              2 B     i            1.00
## 10          2              2 C     k            2.00
## 11          2              3 A     d            0
## 12          2              3 A     e            0
## 13          2              3 B     h            0
## 14          2              3 B     i            0
dat %>%
  complete(
    project_id, question_count,
    tidyr::nesting(meta, domain1),
    fill = list(tag_count = 0)
  )
## # A tibble: 20 x 5
##    project_id question_count meta  domain1 tag_count
##         <int>          <int> <chr> <chr>       <dbl>
##  1          1              2 A     d            0
##  2          1              2 A     e            0
##  3          1              2 B     h            0
##  4          1              2 B     i            0
##  5          1              2 C     k            0
##  6          1              3 A     d            2.00
##  7          1              3 A     e            1.00
##  8          1              3 B     h            3.00
##  9          1              3 B     i            2.00
## 10          1              3 C     k            0
## 11          2              2 A     d            1.00
## 12          2              2 A     e            0
## 13          2              2 B     h            0
## 14          2              2 B     i            1.00
## 15          2              2 C     k            2.00
## 16          2              3 A     d            0
## 17          2              3 A     e            0
## 18          2              3 B     h            0
## 19          2              3 B     i            0
## 20          2              3 C     k            0
dat %>%
  complete(
    meta, domain1,
    tidyr::nesting(project_id, question_count),
    fill = list(tag_count = 0)
  )
## # A tibble: 30 x 5
##    meta  domain1 project_id question_count tag_count
##    <chr> <chr>        <int>          <int>     <dbl>
##  1 A     d                1              3      2.00
##  2 A     d                2              2      1.00
##  3 A     e                1              3      1.00
##  4 A     e                2              2      0
##  5 A     h                1              3      0
##  6 A     h                2              2      0
##  7 A     i                1              3      0
##  8 A     i                2              2      0
##  9 A     k                1              3      0
## 10 A     k                2              2      0
## # ... with 20 more rows
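Answer 0 (score: 4)
We can use two nesting() calls, one for each group of columns. Each nesting() keeps only the combinations of its columns that actually occur in the data, and complete() then crosses the two sets with each other.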
library(dplyr)
library(tidyr)
dat %>%
  complete(
    nesting(meta, domain1),
    nesting(project_id, question_count),
    fill = list(tag_count = 0)
  ) %>%
  arrange(project_id, meta) %>%
  select(names(dat))
# # A tibble: 10 x 5
#    project_id meta  domain1 question_count tag_count
#         <int> <chr> <chr>            <int>     <dbl>
#  1          1 A     d                    3      2.00
#  2          1 A     e                    3      1.00
#  3          1 B     h                    3      3.00
#  4          1 B     i                    3      2.00
#  5          1 C     k                    3      0
#  6          2 A     d                    2      1.00
#  7          2 A     e                    2      0
#  8          2 B     h                    2      0
#  9          2 B     i                    2      1.00
# 10          2 C     k                    2      2.00
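To see why only the version with two nesting() calls returns 10 rows, it can help to count the grid that each call builds. The following is a minimal sketch, assuming the dat from the question. The names n_attempt1 to n_answer and res are only illustrative, and filling with 0L instead of 0 is my own suggestion rather than part of the code above.

```r
library(dplyr)
library(tidyr)

# Same data as in the question.
dat <- tibble(
  project_id     = as.integer(c(rep(1, 4), rep(2, 3))),
  meta           = c("A", "A", "B", "B", "A", "B", "C"),
  domain1        = c("d", "e", "h", "i", "d", "i", "k"),
  question_count = as.integer(c(rep(3, 4), rep(2, 3))),
  tag_count      = as.integer(c(2, 1, 3, 2, 1, 1, 2))
)

# complete() is documented as a wrapper around expand(), a full join and
# replace_na(), so counting the rows of expand() shows how large each grid is.
# Bare columns are crossed with everything; nesting() keeps observed combos only.

# 2 project_ids x 7 observed (question_count, meta, domain1) triples
n_attempt1 <- nrow(expand(dat, project_id, nesting(question_count, meta, domain1)))
# 2 project_ids x 2 question_counts x 5 observed (meta, domain1) pairs
n_attempt2 <- nrow(expand(dat, project_id, question_count, nesting(meta, domain1)))
# 3 metas x 5 domain1 values x 2 observed (project_id, question_count) pairs
n_attempt3 <- nrow(expand(dat, meta, domain1, nesting(project_id, question_count)))
# 5 observed (meta, domain1) pairs x 2 observed (project_id, question_count) pairs
n_answer   <- nrow(expand(dat, nesting(meta, domain1), nesting(project_id, question_count)))

c(n_attempt1, n_attempt2, n_attempt3, n_answer)
# [1] 14 20 30 10

# Filling with an integer 0L should keep tag_count an integer column,
# whereas the outputs above show it turning into <dbl> with fill = 0.
res <- dat %>%
  complete(
    nesting(meta, domain1),
    nesting(project_id, question_count),
    fill = list(tag_count = 0L)
  ) %>%
  arrange(project_id, meta)
```

The counts match the tibble sizes printed in the question (14, 20 and 30 rows) and the 10 rows of the result above: project_id and question_count always travel together, just as meta and domain1 do, so each pair has to go into its own nesting().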