date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 NA
5: 1/31/2013 14 NA
6: 2/28/2013 20 NA
7: 2/28/2013 30 3.000000
8: 2/28/2013 40 3.636364
9: 2/28/2013 50 4.166667
10: 2/28/2013 60 4.615385
11: 3/30/2013 10 NA
12: 3/30/2013 11 0.550000
13: 3/30/2013 12 0.400000
14: 3/30/2013 13 0.325000
15: 3/30/2013 15 0.300000
The output above was produced by the following code:
library(data.table) # CRAN version 1.10.4 used
setDT(bb)[, D := bal / shift(bal, 6L)][seq(1L, nrow(bb), 5L), D := NA][]
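Now my question is:
At the 4th and 5th position of each group the result should print 100%, i.e. for rows 4, 5, 9, 10, 14, 15 and so on, the value under D should be 100%.
The values in D should be shown in %.
Expected output: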
date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 100.00
5: 1/31/2013 14 100.00
6: 2/28/2013 20 NA
7: 2/28/2013 30 300.00
8: 2/28/2013 40 363.64
9: 2/28/2013 50 100.00
10: 2/28/2013 60 100.00
11: 3/30/2013 10 NA
12: 3/30/2013 11 55.00
13: 3/30/2013 12 40.00
14: 3/30/2013 13 100.00
15: 3/30/2013 15 100.00
That is the expected output.
Answer 0 (score: 2)
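This assumes the same conditions as in my previous answer, i.e. the number of rows is always the same for each date. With this observation, a very simple solution is to lag the values of bal by 6 rows. As this ignores the groups at first, the result D has to be set to NA in the first row of each group, i.e. in every 5th row starting with row 1.
The additional request to manually override specific rows with 1.0 (which prints as 100%) is handled likewise by computing the respective row indices.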
library(data.table)
setDT(bb)[, D := formattable::percent(bal / shift(bal, 6L))][seq(1L, .N, 5L), D := NA][
rep(seq(4L, nrow(bb), 5L), each = 2L) + 0:1, D := 1.0][]
date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 100.00%
5: 1/31/2013 14 100.00%
6: 2/28/2013 20 NA
7: 2/28/2013 30 300.00%
8: 2/28/2013 40 363.64%
9: 2/28/2013 50 100.00%
10: 2/28/2013 60 100.00%
11: 3/30/2013 10 NA
12: 3/30/2013 11 55.00%
13: 3/30/2013 12 40.00%
14: 3/30/2013 13 100.00%
15: 3/30/2013 15 100.00%
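The two index expressions used in i can be inspected on their own. The following is a minimal base R sketch, assuming 15 rows in 3 groups of 5 rows each as in the question; the variable names n_rows, grp_len, first_rows and override_rows are only illustrative and do not appear in the answer's code.

```r
# Assumed layout: 15 rows in 3 groups of 5 rows each, as in the question.
n_rows <- 15L
grp_len <- 5L

# First row of each group: these rows are set to NA.
first_rows <- seq(1L, n_rows, grp_len)

# 4th and 5th row of each group: take the 4th row of each group,
# repeat each of them twice, then add 0 and 1 alternately
# (the shorter vector 0:1 is recycled).
override_rows <- rep(seq(4L, n_rows, grp_len), each = 2L) + 0:1

print(first_rows)
print(override_rows)
```

With these assumptions first_rows is 1, 6, 11 and override_rows is 4, 5, 9, 10, 14, 15, which are exactly the rows that show NA and 100.00% in the output above.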
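Note that the percent function from the formattable package is used. This has the advantage that the values are still numeric and can be used in calculations, but are printed as percentages.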
As requested by the OP, here is also a version that does not use formattable::percent():
setDT(bb)[, D := 100.0 * bal / shift(bal, 6L)][seq(1L, .N, 5L), D := NA][
rep(seq(4L, nrow(bb), 5L), each = 2L) + 0:1, D := 100.0][]
date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 100.0000
5: 1/31/2013 14 100.0000
6: 2/28/2013 20 NA
7: 2/28/2013 30 300.0000
8: 2/28/2013 40 363.6364
9: 2/28/2013 50 100.0000
10: 2/28/2013 60 100.0000
11: 3/30/2013 10 NA
12: 3/30/2013 11 55.0000
13: 3/30/2013 12 40.0000
14: 3/30/2013 13 100.0000
15: 3/30/2013 15 100.0000
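The expected output in the question shows two decimals, whereas the plain numeric version prints more. One possible way to get closer, not part of the answer's code, is to round the percentage. A minimal base R sketch with three of the ratios from the example (the names ratio and pct are only illustrative):

```r
# Three ratios taken from the example data: rows 7, 8 and 12.
ratio <- c(30 / 10, 40 / 11, 11 / 20)

# Scale to percent and round to two decimals; the result stays numeric.
pct <- round(100 * ratio, 2)

print(pct)
```

This gives 300, 363.64 and 55 as numeric values. How many decimals are shown when the table is printed still depends on R's print settings.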
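The OP has asked for a dynamic version where the user can choose which rows within each group are set to 100. I have tried to create a fully flexible version where the number of elements in each group is dynamic as well (it still needs to be the same in all groups) and have packaged it as a function: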
divide_by_group <- function(DF,
                            id_of_rows_in_group_to_override = NA,
                            val_override = 100.0) {
  library(data.table)
  # check parameters
  checkmate::assert_data_frame(DF)
  checkmate::assert_names(c("date", "bal"), subset.of = names(DF))
  checkmate::assert_number(val_override)
  # retrieve group length, verify all groups have the same length
  l_grp <- setDT(DF)[, .N, by = date][
    , if (any(N != first(N))) stop("Differing group lengths") else first(N)]
  # verify user specified row ids
  checkmate::assert_integerish(id_of_rows_in_group_to_override,
                               lower = 1L, upper = l_grp)
  # compute result: lag by group length + 1, blank the first row of each group
  result <- DF[, D := 100.0 * bal / shift(bal, l_grp + 1L)][seq(1L, .N, l_grp), D := NA]
  # apply override
  # compute absolute row numbers: position within group + offset of each group
  # (the first group is skipped; the offsets step by the group length)
  rn <- c(outer(id_of_rows_in_group_to_override,
                seq(l_grp, nrow(DF) - l_grp, l_grp), `+`))
  # verify rn is in range
  checkmate::assert_integerish(rn, lower = l_grp + 1L, upper = nrow(DF))
  result[rn, D := val_override]
  return(result[])
}
Note that more than 50% of the code is devoted to checking parameters and assumptions.
Example calls
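The step that turns positions within a group into absolute row numbers can be tried in isolation. The following is a minimal base R sketch, assuming 15 rows in groups of 5 and assuming that the group offsets step by the group length; the names grp_len, n_rows, pos_in_group and offsets are only illustrative and are not used in the function.

```r
grp_len <- 5L               # rows per group (assumed equal for all groups)
n_rows <- 15L               # total number of rows
pos_in_group <- c(2L, 5L)   # positions within a group to override

# Row offsets of the groups from the 2nd group onwards
# (the first group has no lagged values and is skipped).
offsets <- seq(grp_len, n_rows - grp_len, grp_len)

# outer() adds every position to every offset; c() flattens the matrix.
rn <- c(outer(pos_in_group, offsets, `+`))

print(rn)
```

Under these assumptions rn is 7, 10, 12, 15, i.e. the 2nd and 5th rows of the second and third groups.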
divide_by_group(bb)
date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 NA
5: 1/31/2013 14 NA
6: 2/28/2013 20 NA
7: 2/28/2013 30 300.0000
8: 2/28/2013 40 363.6364
9: 2/28/2013 50 416.6667
10: 2/28/2013 60 461.5385
11: 3/30/2013 10 NA
12: 3/30/2013 11 55.0000
13: 3/30/2013 12 40.0000
14: 3/30/2013 13 32.5000
15: 3/30/2013 15 30.0000
divide_by_group(bb, 4:5)
date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 NA
5: 1/31/2013 14 NA
6: 2/28/2013 20 NA
7: 2/28/2013 30 300.0000
8: 2/28/2013 40 363.6364
9: 2/28/2013 50 100.0000
10: 2/28/2013 60 100.0000
11: 3/30/2013 10 NA
12: 3/30/2013 11 55.0000
13: 3/30/2013 12 40.0000
14: 3/30/2013 13 100.0000
15: 3/30/2013 15 100.0000
divide_by_group(bb, c(2, 5), -9.9)
date bal D
1: 1/31/2013 10 NA
2: 1/31/2013 11 NA
3: 1/31/2013 12 NA
4: 1/31/2013 13 NA
5: 1/31/2013 14 NA
6: 2/28/2013 20 NA
7: 2/28/2013 30 -9.9000
8: 2/28/2013 40 363.6364
9: 2/28/2013 50 416.6667
10: 2/28/2013 60 -9.9000
11: 3/30/2013 10 NA
12: 3/30/2013 11 -9.9000
13: 3/30/2013 12 40.0000
14: 3/30/2013 13 32.5000
15: 3/30/2013 15 -9.9000