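I'm looking for a way to apply a conditional operation to one variable in a data frame based on the value of another variable, and I can't quite work out how to do it.
Say I have the following daily data for an animal shelter: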
shelters <- data.frame(
  date = rep(seq(as.Date("2014-02-01"), length.out = 10, by = "1 day"), 10),
  animal = sample(c("dog", "cat", "goldfish"), 100, replace = TRUE),
  intake = sample(1:10, 100, replace = TRUE))
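I want to set up a rule that amends the data frame so that, once the running total of a given animal's "intake" variable reaches a given value, every "intake" value after the date on which that value was reached becomes 0.
Let's say the maximum for each animal is as follows: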
dog = 90
cat = 100
goldfish = 85
I'm inclined to use cumsum, but how do I specify the animal-specific value in the cumsum formula?
Answer 0 (score: 0)
Try:
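# Attach each animal's maximum to its rows via a named-vector lookup
shelters$indx <- setNames(c(90, 100, 85), c("dog", "cat", "goldfish"))[as.character(shelters$animal)]
library(dplyr)
shelters %>%
  arrange(date, animal) %>%
  group_by(animal) %>%
  # Running total per animal; set intake to 0 once the total exceeds the maximum
  mutate(Sum = cumsum(intake), intake = ifelse(Sum > indx, 0, intake)) %>%
  select(-indx, -Sum) %>%
  head()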
#        date animal intake
#1 2014-02-01    cat      1
#2 2014-02-01    cat      4
#3 2014-02-01    cat      2
#4 2014-02-01    cat      2
#5 2014-02-01    dog      5
#6 2014-02-01    dog      9
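The same rule can be checked by hand on a small example. The following is only a sketch in base R (it swaps `ave()` in for the dplyr grouping), and the `toy` data frame, the `caps` vector, and the `intake_capped` column are hypothetical names and values made up for illustration:

```r
# Hypothetical toy data, small enough to verify by hand
toy <- data.frame(
  animal = c("dog", "dog", "dog", "cat", "cat"),
  intake = c(4, 5, 3, 6, 2),
  stringsAsFactors = FALSE
)

# Hypothetical per-animal maxima, chosen small to suit the toy data
caps <- c(dog = 8, cat = 7)

# Running total within each animal, returned in the original row order
running <- ave(toy$intake, toy$animal, FUN = cumsum)

# Look up each row's maximum by animal name
cap_per_row <- unname(caps[toy$animal])

# Zero out every intake whose running total exceeds the maximum
toy$intake_capped <- ifelse(running > cap_per_row, 0, toy$intake)
```

Here the running totals are 4, 9, 12 for dog and 6, 8 for cat, so `intake_capped` comes out as 4, 0, 0, 6, 0. Note that the row that crosses the maximum is zeroed as well, so the per-animal total (4 for dog, 6 for cat) can end up below the maximum.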
If you want each animal's total to equal its maximum exactly:
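res <- shelters %>%
  arrange(date, animal) %>%
  group_by(animal) %>%
  # On the row that crosses the maximum, keep only the part of intake that fits;
  # later rows come out negative and are set to 0
  mutate(Sum = cumsum(intake),
         index2 = ifelse(Sum > indx, intake - (Sum - indx), intake),
         intake = ifelse(index2 < 0, 0, index2)) %>%
  select(-indx, -Sum, -index2)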
res %>% summarize(Sum=sum(intake))
#Source: local data frame [3 x 2]
#    animal Sum
#1      cat 100
#2      dog  90
#3 goldfish  85
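An alternative way to get totals that hit the maximum exactly is to clamp the running total and then difference it back into per-row values. This is a sketch of that different technique in base R, not the code from the answer above; the `toy` data, the `caps` values, and the `intake_capped` column are again hypothetical:

```r
# Hypothetical toy data and per-animal maxima, as a hand-checkable example
toy <- data.frame(
  animal = c("dog", "dog", "dog", "cat", "cat"),
  intake = c(4, 5, 3, 6, 2),
  stringsAsFactors = FALSE
)
caps <- c(dog = 8, cat = 7)

# Running total within each animal
running <- ave(toy$intake, toy$animal, FUN = cumsum)

# Clamp the running total so it never exceeds the animal's maximum
clamped <- pmin(running, unname(caps[toy$animal]))

# Difference the clamped totals within each animal to recover per-row intake
toy$intake_capped <- ave(clamped, toy$animal,
                         FUN = function(x) diff(c(0, x)))

# Per-animal totals of the capped intake
totals <- tapply(toy$intake_capped, toy$animal, sum)
```

For dog the clamped running totals are 4, 8, 8, which difference to 4, 4, 0; for cat they are 6, 7, which difference to 6, 1. The totals therefore equal the assumed maxima (8 for dog, 7 for cat), provided the uncapped total reaches the maximum in the first place.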