I have added sample data below. I used dplyr to count by Rco and month:
structure(list(Rco = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 4L, 4L, 4L), .Label = c("A220", "B334", "C123", "D445"
), class = "factor"), month = structure(c(3L, 2L, 4L, 1L, 3L,
2L, 4L, 1L, 3L, 4L, 2L, 4L, 3L), .Label = c("Apr", "Feb", "Jan",
"Mar"), class = "factor"), count = c(1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13)), .Names = c("Rco", "month", "count"), row.names = c(NA,
-13L), class = "data.frame")
Is there a way to transform this data into:
structure(list(Rco = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("A220", "B334",
"C123", "D445"), class = "factor"), month = structure(c(3L, 2L,
4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L, 3L, 2L, 4L, 1L), .Label = c("Apr",
"Feb", "Jan", "Mar"), class = "factor"), count = c(1, 2, 3, 4,
5, 6, 7, 8, 9, 0, 10, 0, 13, 11, 12, 0)), .Names = c("Rco", "month",
"count"), row.names = c(NA, -16L), class = "data.frame")
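So basically I need to add extra rows for every month that has no count, because dplyr::count does not return a 0 count when an Rco - month combination does not exist.
The number of months in my data varies (I have shown Jan, Feb, Mar and Apr, but it could be all 12 months), so I would appreciate a dynamic solution.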
Answer 0: (score: 4)
You can use tidyr::complete and set the fill value to 0 (instead of the default NA):
library(tidyr)
complete(df, Rco, month, fill = list(count = 0))
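As a self-contained illustration of the call above, here is a minimal sketch on a small made-up data frame (the three rows below are an assumption for demonstration, not the asker's data). It relies on complete() expanding over the levels of the Rco and month factors, so no list of months has to be hard-coded.

```r
library(tidyr)

# Small illustrative data frame (hypothetical values, not the asker's data)
df <- data.frame(
  Rco   = factor(c("A220", "A220", "B334")),
  month = factor(c("Jan", "Feb", "Jan")),
  count = c(1, 2, 5)
)

# complete() adds every Rco x month combination that is missing
# and fills its count with 0 instead of the default NA
res <- complete(df, Rco, month, fill = list(count = 0))
print(res)
```

Because the expansion is driven by the values present in the data, the same call should work unchanged whether the data holds four months or all twelve.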
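Answer 1: (score: 2)
We can use expand.grid on the unique values of the first two columns and merge the result with the initial dataset. This fills in NA for the combinations that are not present in the initial dataset.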
res <- merge(expand.grid(lapply(df1[1:2], unique)), df1, all.x=TRUE)
However, it is easy to change the NA values to 0:
res[is.na(res)] <- 0
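The two base R steps above can be sketched end to end as follows. The data frame is a small made-up example (an assumption, not the asker's data), and the columns are selected by name rather than by position. As a variation on the line above, the NA replacement here targets only the count column, which avoids touching the factor columns.

```r
# Small illustrative data frame (hypothetical values, not the asker's data)
df1 <- data.frame(
  Rco   = factor(c("A220", "A220", "B334")),
  month = factor(c("Jan", "Feb", "Jan")),
  count = c(1, 2, 5)
)

# All combinations of the unique Rco and month values
grid <- expand.grid(lapply(df1[c("Rco", "month")], unique))

# Left join on the shared columns: combinations absent from df1 get NA in count
res <- merge(grid, df1, all.x = TRUE)

# Replace NA only in the count column
res$count[is.na(res$count)] <- 0
print(res)
```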