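I have a list of 981 data frames. Each data.frame has the same structure.
I want to lag one column (called growth) to compute, for each data frame, the growth over time (from one observation to the next).
I tried lapply in several ways but could not get it to work.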
my_list <-
list(
data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2))
)
Answer 0 (score: 2)
If you cannot share your real data, you can create a fake data set to make the post reproducible.
If I understand you correctly, you can use lapply:
lapply(list_df, function(x) {x$difference <- c(NA, diff(x$growth)); x})
#[[1]]
# growth b difference
#1 3 a NA
#2 8 b 5
#3 4 c -4
#4 7 d 3
#5 6 e -1
#6 1 f -5
#7 10 g 9
#8 9 h -1
#9 2 i -7
#10 5 j 3
#[[2]]
# growth b difference
#1 10 a NA
#2 5 b -5
#3 6 c 1
#4 9 d 3
#5 1 e -8
#6 7 f 6
#7 8 g 1
#8 4 h -4
#9 3 i -1
#10 2 j -1
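For data shaped like the question's my_list (columns time and growth), the same idea might be sketched as follows. The helper name add_lag and the new column names growth_lag and difference are illustrative choices, not from the original post, and the sort by time is an assumption about what "over time" should mean.

```r
# Example data in the same shape as the question's my_list.
set.seed(1)
my_list <- list(
  data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
  data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2))
)

# Hypothetical helper: add the previous observation and the change from it.
add_lag <- function(df) {
  df <- df[order(df$time), ]                    # make sure rows are in time order
  df$growth_lag <- c(NA, head(df$growth, -1))   # value at the previous observation
  df$difference <- df$growth - df$growth_lag    # same values as c(NA, diff(df$growth))
  df
}

result <- lapply(my_list, add_lag)
```

The first row of each data frame gets NA because there is no earlier observation to compare against.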
This can also be done with the tidyverse:
library(dplyr)
library(purrr)
map(list_df, . %>% mutate(difference = c(NA, diff(growth))))
or
map(list_df, . %>% mutate(difference = growth - lag(growth)))
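If the rows are not guaranteed to be in time order, dplyr's lag() accepts an order_by argument. A minimal sketch, assuming each data frame has a time column as in the question (the toy list_df in this answer does not, so the data named dfs below is made up for illustration):

```r
library(dplyr)
library(purrr)

# Illustrative data: rows deliberately out of time order.
dfs <- list(
  data.frame(time = c(3, 1, 2), growth = c(5, 1, 4)),
  data.frame(time = c(2, 3, 1), growth = c(7, 9, 2))
)

# lag() is evaluated in time order, but the rows keep their original positions.
out <- map(dfs, ~ mutate(.x, difference = growth - lag(growth, order_by = time)))
```

Here the NA lands on the row with the smallest time value rather than on the first row of the data frame.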
Data
set.seed(123)
list_df <- list(data.frame(growth = sample(10), b = letters[1:10]),
data.frame(growth = sample(10), b = letters[1:10]))
Answer 1 (score: 1)
We can use transform inside lapply in base R:
lapply(list_df, transform, difference = c(NA, diff(growth)))
#[[1]]
# growth b difference
#1 3 a NA
#2 10 b 7
#3 2 c -8
#4 8 d 6
#5 6 e -2
#6 9 f 3
#7 1 g -8
#8 7 h 6
#9 5 i -2
#10 4 j -1
#[[2]]
# growth b difference
#1 10 a NA
#2 5 b -5
#3 3 c -2
#4 8 d 5
#5 1 e -7
#6 4 f 3
#7 6 g 2
#8 9 h 3
#9 7 i -2
#10 2 j -5
Data
set.seed(123)
list_df <- list(data.frame(growth = sample(10), b = letters[1:10]),
data.frame(growth = sample(10), b = letters[1:10]))