I have the following dataset:
df <- data.frame(a=1:10,b=10:1)
I have the following function:
fun <- function(x,y) x*y/1000+x+y
I want the following output:
for (i in 2:10) {df$a[i] <- fun(df$a[i], df$a[i-1])}
for (i in 2:10) {df$b[i] <- fun(df$b[i], df$b[i-1])}
df
#            a        b
# 1   1.000000 10.00000
# 2   3.002000 19.09000
# 3   6.011006 27.24272
# 4  10.035050 34.43342
# 5  15.085225 40.64002
# 6  21.175737 45.84322
# 7  28.323967 50.02659
# 8  36.550559 53.17667
# 9  45.879514 55.28303
# 10 56.338309 56.33831
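Essentially, the element in row i is a function of the previous row's output and the current row's value, and this is applied recursively down the column. Is there a better way to do this?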
Answer 0 (score: 4)
We can use the accumulate function from the purrr package.
library(purrr)
df <- data.frame(a=1:10,b=10:1)
fun <- function(x,y) x*y/1000+x+y
df$a <- accumulate(df$a, fun)
df$b <- accumulate(df$b, fun)
df
#            a        b
# 1   1.000000 10.00000
# 2   3.002000 19.09000
# 3   6.011006 27.24272
# 4  10.035050 34.43342
# 5  15.085225 40.64002
# 6  21.175737 45.84322
# 7  28.323967 50.02659
# 8  36.550559 53.17667
# 9  45.879514 55.28303
# 10 56.338309 56.33831
Answer 1 (score: 2)
In base R, use Reduce with accumulate = TRUE:
df[] <- lapply(df, function(x) Reduce(fun, x, accumulate = TRUE))
df
#           a        b
#1   1.000000 10.00000
#2   3.002000 19.09000
#3   6.011006 27.24272
#4  10.035050 34.43342
#5  15.085225 40.64002
#6  21.175737 45.84322
#7  28.323967 50.02659
#8  36.550559 53.17667
#9  45.879514 55.28303
#10 56.338309 56.33831
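
One thing worth checking before swapping the loop for Reduce: Reduce passes the running result as the first argument and the current element as the second, while the loop in the question calls fun(current, previous). That does not matter here because fun is symmetric in x and y, but it would for a non-symmetric function. Below is a minimal base R sketch that compares the two approaches on the question's data. The helper name loop_accumulate is made up for this illustration and is not part of either answer.

```r
# Function from the question; note it is symmetric in x and y.
fun <- function(x, y) x * y / 1000 + x + y

# Hypothetical helper: the question's explicit loop, generalised to any vector.
loop_accumulate <- function(v, f) {
  out <- as.numeric(v)                 # work on doubles, not integers
  for (i in seq_along(out)[-1]) {      # positions 2..n (empty if n < 2)
    out[i] <- f(out[i], out[i - 1])    # f(current, previous), as in the question
  }
  out
}

df <- data.frame(a = 1:10, b = 10:1)

# Loop version and Reduce version, applied column by column.
by_loop   <- lapply(df, loop_accumulate, f = fun)
by_reduce <- lapply(df, function(x) Reduce(fun, x, accumulate = TRUE))

# Compare with a numeric tolerance rather than exact equality.
same <- isTRUE(all.equal(by_loop, by_reduce))
print(same)
```

If the function were not symmetric, wrapping it as function(acc, x) fun(x, acc) before handing it to Reduce should reproduce the loop's argument order.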