Question

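I have a data frame with a date column and a cumulative-sum column. The cumulative-sum data ends at a certain point, and I want to use a formula to compute the values for the remaining dates. My problem is getting the formula to reference the previous cell in the column, starting from the point where the count drops back to 0 (where the historical cumulative amounts end).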

Example below:

dates.1 <- c("2016-12-06","2016-12-07","2016-12-08","2016-12-09","2016-12-10","2016-12-11","2016-12-12","2016-12-13","2016-12-14")
count.1 <- c(1,3,8,10,0,0,0,0,0)
drift <- .0456


df.1 <- data.frame(cbind(dates.1,count.1))


for (i in df.1$count.1) {
  if (i == 0) {
    head(df.1$count.1, n = 1L)+exp(drift+(qnorm(runif(5,0,1))))
  }
}

I can't get the for loop to compute this.

The reason for n = 5 in runif is that this is the number of future entries I want to run the formula for.

The desired output would look something like this:

print(df.1$count.1)

[1] 1 3 8 10 12 13 16 17 18

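The numbers after the 4th element are just random. The general idea is that the column gets overwritten, keeping the historical data and putting newly calculated entries in place of the zeros.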

Any ideas?

Answer 1

No loop is needed. You can get what you want by first finding the row index at which the cumsum stops:

last.ind <- which(df.1$count.1==0)[1]-1

Then use this last.ind to restart the cumsum:

set.seed(123)  ## for reproducibility
## simulation of rest of data to cumulatively sum
rest.of.data <- exp(drift+(qnorm(runif(5,0,1))))
df.1$count.1[last.ind:length(df.1$count.1)] <- cumsum(c(df.1$count.1[last.ind],rest.of.data))
print(df.1$count.1)
##[1]  1.00000  3.00000  8.00000 10.00000 10.59757 12.92824 13.75970 17.20085 22.17527
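
The two steps above can also be wrapped in a small helper, so that the number of simulated values follows from the number of trailing zeros instead of being hard-coded as 5. This is only a minimal sketch and not part of the original answer: the function name fill_forward and its arguments are hypothetical, and it assumes the zeros appear only as one trailing block after at least one historical value.

```r
# Hypothetical helper (not from the original answer): replace the trailing
# zeros of a cumulative series with a simulated continuation.
# Assumption: zeros occur only as a trailing block after the history.
fill_forward <- function(x, drift) {
  zero.idx <- which(x == 0)
  # Nothing to fill, or no historical value to continue from
  if (length(zero.idx) == 0 || zero.idx[1] == 1) return(x)
  last.ind <- zero.idx[1] - 1        # last row with historical data
  n.new <- length(x) - last.ind      # number of entries to simulate
  increments <- exp(drift + qnorm(runif(n.new, 0, 1)))
  # Continue the cumulative sum from the last historical value
  x[(last.ind + 1):length(x)] <- x[last.ind] + cumsum(increments)
  x
}

set.seed(123)  # for reproducibility
filled <- fill_forward(c(1, 3, 8, 10, 0, 0, 0, 0, 0), drift = 0.0456)
print(filled)
```

Because every increment is exp() of something, it is strictly positive, so the filled series keeps increasing like a cumulative sum should.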

If you really do want to use a loop, you should do the following, which gives the same result (for the same seed) but is slower:

for (i in seq_along(df.1$count.1)) {
  if (df.1$count.1[i] == 0) {
    df.1$count.1[i] <- df.1$count.1[i-1] + exp(drift+(qnorm(runif(1,0,1))))
  }
}

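Notes:

Loop over the indices of df.1$count.1, not over its values.
If the value at the current index i is 0, overwrite it with the sum of the previous value at i-1 and the new data to be cumulatively summed.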
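
To check that the loop and the vectorized version really agree, a sketch like the following resets the seed before each approach and compares the results. The variable names count, vec and lp are made up for this illustration, and the i > 1 guard is an addition that is not in the original answer; it only protects against a zero in the first position, where there is no previous element.

```r
drift <- 0.0456
count <- c(1, 3, 8, 10, 0, 0, 0, 0, 0)

# Vectorized version: restart the cumsum from the last historical value
set.seed(123)
vec <- count
last.ind <- which(vec == 0)[1] - 1
rest.of.data <- exp(drift + qnorm(runif(5, 0, 1)))
vec[last.ind:length(vec)] <- cumsum(c(vec[last.ind], rest.of.data))

# Loop version: each zero becomes the previous value plus one new draw
set.seed(123)
lp <- count
for (i in seq_along(lp)) {
  if (i > 1 && lp[i] == 0) {
    lp[i] <- lp[i - 1] + exp(drift + qnorm(runif(1, 0, 1)))
  }
}

# Compare up to floating-point tolerance
print(all.equal(vec, lp))
```

Both versions consume the random number stream in the same order, which is why resetting the seed makes them comparable.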

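Also, you should not use cbind to create a data.frame. Doing so here makes df.1$count.1 a factor rather than numeric (or a character vector in R 4.0 and later, where strings are no longer converted to factors by default). The data used is: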

Data:

df.1 <- structure(list(dates.1 = structure(1:9, .Label = c("2016-12-06",
    "2016-12-07", "2016-12-08", "2016-12-09", "2016-12-10", "2016-12-11",
    "2016-12-12", "2016-12-13", "2016-12-14"), class = "factor"),
    count.1 = c(1, 3, 8, 10, 0, 0, 0, 0, 0)),
    .Names = c("dates.1", "count.1"), row.names = c(NA, -9L), class = "data.frame")
##     dates.1 count.1
##1 2016-12-06       1
##2 2016-12-07       3
##3 2016-12-08       8
##4 2016-12-09      10
##5 2016-12-10       0
##6 2016-12-11       0
##7 2016-12-12       0
##8 2016-12-13       0
##9 2016-12-14       0
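
The cbind problem can be seen directly by inspecting the column classes. The sketch below uses a shortened version of the question's vectors; the names bad and good are made up for illustration, and converting the dates with as.Date is an optional suggestion that the original answer does not make.

```r
dates.1 <- c("2016-12-06", "2016-12-07", "2016-12-08")
count.1 <- c(1, 3, 8)

# cbind() on a character and a numeric vector yields a character matrix,
# so the numeric type is lost before data.frame() ever sees the column.
bad <- data.frame(cbind(dates.1, count.1))
print(class(bad$count.1))   # "factor" or "character" depending on R version

# Passing the vectors directly keeps each column's own type.
good <- data.frame(dates.1 = as.Date(dates.1), count.1 = count.1)
print(class(good$count.1))
print(class(good$dates.1))
```

With count.1 stored as numeric, comparisons such as df.1$count.1 == 0 and the arithmetic in the answer work as intended.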

Referencing the previous element in the middle of a column in R
