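I am looking for a way to combine the following two dplyr operations into one, to make my code shorter and more elegant.
Overall, I want to compute a sum over `counter` when the conditions on `x` and `y` are met, and then assign that sum to the upcoming 4 quarters (`quarter`), excluding the current quarter. Some cases overlap and need to be summed together. I cannot use dplyr's `lag()` because, after summarising, the output no longer contains all of the subsequent quarters. That is why I had to take a "detour" and split the dplyr operation into two parts. I am now looking for an elegant way to do everything in a single operation and avoid the intermediate step.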
# Reproducible example
library(dplyr)

compid <- c(rep("A", 10), rep("B", 10))
quarter <- c(11:20, 11:20)
x <- c(0,1,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0)
counter <- c(0,1,2,0,1,0,0,0,0,0,1,1,0,1,0,0,0,0,0,0)
dat <- data.frame(compid, quarter, x, counter)
# First, I create the sum count
dat %>%
  group_by(compid, quarter) %>%
  filter(x == 1) %>%
  summarise(sumcount = sum(counter)) %>%
  ungroup() -> temp
# Then, I did not know how to do this part in dplyr. I want to eliminate this intermediate step.
temp1 <- temp
temp1$quarter <- temp1$quarter + 1
temp2 <- temp
temp2$quarter <- temp2$quarter + 2
temp3 <- temp
temp3$quarter <- temp3$quarter + 3
temp4 <- temp
temp4$quarter <- temp4$quarter + 4
temp <- rbind(temp1, temp2, temp3, temp4)
# Lastly, I went back to dplyr to consolidate and refine the data
temp %>%
  group_by(compid, quarter) %>%
  summarise(sumcount = sum(sumcount)) %>%
  right_join(dat, by = c("compid", "quarter")) %>%
  mutate(sumcount = ifelse(is.na(sumcount), 0, sumcount))
Answer 0 (score: 1)
Here is a tidy way to replace the intermediate step:
dat %>%
  # this chunk is unchanged
  group_by(compid, quarter) %>%
  filter(x == 1) %>%
  summarise(sumcount = sum(counter)) %>%
  ungroup() %>%
  # this replaces the creation of temp datasets
  mutate(Q1 = quarter + 1,
         Q2 = quarter + 2,
         Q3 = quarter + 3,
         Q4 = quarter + 4) %>%
  select(-quarter) %>%
  tidyr::gather(key, quarter, -compid, -sumcount) %>%
  select(compid, quarter, sumcount) %>%
  # this chunk is unchanged
  group_by(compid, quarter) %>%
  summarise(sumcount = sum(sumcount)) %>%
  right_join(dat, by = c("compid", "quarter")) %>%
  mutate(sumcount = ifelse(is.na(sumcount), 0, sumcount))
# A tibble: 20 x 5
# Groups:   compid [2]
   compid quarter sumcount     x counter
   <fctr>   <dbl>    <dbl> <dbl>   <dbl>
 1 A           11        0     0       0
 2 A           12        0     1       1
 3 A           13        1     1       2
 4 A           14        3     0       0
 5 A           15        3     0       1
 6 A           16        3     0       0
 7 A           17        2     0       0
 8 A           18        0     0       0
 9 A           19        0     0       0
10 A           20        0     0       0
11 B           11        0     1       1
12 B           12        1     1       1
13 B           13        2     0       0
14 B           14        2     0       1
15 B           15        2     0       0
16 B           16        1     0       0
17 B           17        0     0       0
18 B           18        0     0       0
19 B           19        0     0       0
20 B           20        0     0       0
By the way, I don't see any condition on `y`, only on `x`, but I don't think that matters for the question asked here.
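As a possible variation on the same idea, the four shifted copies could also be generated by crossing the qualifying rows with a vector of offsets instead of creating `Q1` to `Q4` and gathering them. This is only a sketch, not part of the original answer: it assumes dplyr 1.0.0 or later (for `.groups` in `summarise()`) and tidyr 1.0.0 or later (for `expand_grid()`), and the names `offset` and `result` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
dat <- data.frame(
  compid  = rep(c("A", "B"), each = 10),
  quarter = rep(11:20, times = 2),
  x       = c(0,1,1,0,0,0,0,0,0,0, 1,1,0,0,0,0,0,0,0,0),
  counter = c(0,1,2,0,1,0,0,0,0,0, 1,1,0,1,0,0,0,0,0,0)
)

result <- dat %>%
  # keep only the rows that satisfy the condition on x
  filter(x == 1) %>%
  # repeat each qualifying row once per offset (hypothetical column `offset`)
  expand_grid(offset = 1:4) %>%
  # shift each copy 1 to 4 quarters ahead, so the current quarter is excluded
  mutate(quarter = quarter + offset) %>%
  # overlapping contributions landing in the same quarter are summed
  group_by(compid, quarter) %>%
  summarise(sumcount = sum(counter), .groups = "drop") %>%
  # bring back every original row; quarters without a contribution get 0
  right_join(dat, by = c("compid", "quarter")) %>%
  mutate(sumcount = coalesce(sumcount, 0)) %>%
  arrange(compid, quarter)
```

Changing the look-ahead window then only requires editing the `1:4` vector. `expand_grid()` is used here rather than `crossing()` because `crossing()` de-duplicates its inputs, which could silently drop identical rows before they are summed.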