How do I lag a column and compute differences for each data frame in a list?

Time: 2019-06-21 08:47:09

Tags: r list dataframe lapply lag

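I have a list of 981 data frames. Every data.frame has the same structure.

I want to lag one column (called growth) so that I can compute, for each data frame, the change in growth over time (from one observation to the next).

I tried lapply, but somehow could not get it to work.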

my_list <- 
  list(
    data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
    data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2)),
    data.frame(time = 1:10, growth = rnorm(10, mean = 1.3, sd = 2))
  )

2 answers:

Answer 0 (score: 2)

If you cannot share your real data, you can create a fake dataset to make the post reproducible.

If I understand you correctly, you can use lapply:

lapply(list_df, function(x) {x$difference <- c(NA, diff(x$growth)); x})

#[[1]]
#   growth b difference
#1       3 a         NA
#2       8 b          5
#3       4 c         -4
#4       7 d          3
#5       6 e         -1
#6       1 f         -5
#7      10 g          9
#8       9 h         -1
#9       2 i         -7
#10      5 j          3

#[[2]]
#   growth b difference
#1      10 a         NA
#2       5 b         -5
#3       6 c          1
#4       9 d          3
#5       1 e         -8
#6       7 f          6
#7       8 g          1
#8       4 h         -4
#9       3 i         -1
#10      2 j         -1
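
As a minimal sketch of the same idea, the anonymous function can be pulled out into a named helper. The name `add_difference` and the small hand-written data are assumptions made for illustration; they are not part of the original answer. The point it shows is that `diff()` returns one value fewer than its input, so a leading `NA` is needed to match the number of rows.

```r
# Hypothetical helper (not from the original answer): adds a first-difference column.
add_difference <- function(df) {
  # diff() returns n - 1 values, so pad with a leading NA to match nrow(df)
  df$difference <- c(NA, diff(df$growth))
  df
}

# Small deterministic stand-in for the asker's list of data frames (assumed values)
my_list <- list(
  data.frame(time = 1:3, growth = c(3, 8, 4)),
  data.frame(time = 1:3, growth = c(10, 5, 6))
)

result <- lapply(my_list, add_difference)
result[[1]]$difference
# NA  5 -4
```

Because `lapply` returns a new list, the result has to be assigned (here to `result`); the original list is left unchanged.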

The tidyverse can do this too:

library(dplyr)
library(purrr)

map(list_df, . %>% mutate(difference = c(NA, diff(growth))))

or

map(list_df, . %>% mutate(difference = growth - lag(growth)))
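
The same call can also be written with purrr's formula-style lambda, which some readers find easier to follow than the `. %>%` form. The sketch below assumes dplyr and purrr are installed and uses small made-up data. It qualifies `lag` as `dplyr::lag` so that `stats::lag`, which behaves differently, is not picked up by accident when dplyr is not attached.

```r
library(dplyr)
library(purrr)

# Small deterministic example data (assumed values, for illustration only)
list_df <- list(
  data.frame(growth = c(3, 8, 4)),
  data.frame(growth = c(10, 5, 6))
)

# Formula-style lambda: .x is the current data frame.
# dplyr::lag() shifts the column down by one row and fills the first row with NA.
result <- map(list_df, ~ mutate(.x, difference = growth - dplyr::lag(growth)))

result[[1]]$difference
# NA  5 -4
```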

Data

set.seed(123)
list_df <- list(data.frame(growth = sample(10), b = letters[1:10]), 
               data.frame(growth = sample(10), b = letters[1:10]))

Answer 1 (score: 1)

We can use transform with lapply in base R:

lapply(list_df, transform, difference = c(NA, diff(growth)))
#[[1]]
#   growth b difference
#1       3 a         NA
#2      10 b          7
#3       2 c         -8
#4       8 d          6
#5       6 e         -2
#6       9 f          3
#7       1 g         -8
#8       7 h          6
#9       5 i         -2
#10      4 j         -1

#[[2]]
#   growth b difference
#1      10 a         NA
#2       5 b         -5
#3       3 c         -2
#4       8 d          5
#5       1 e         -7
#6       4 f          3
#7       6 g          2
#8       9 h          3
#9       7 i         -2
#10      2 j         -5
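
A short sketch of why this one-liner works, using small made-up data: `lapply` forwards the extra named argument to `transform`, and `transform` evaluates `growth` inside each data frame, so no `x$` prefix is needed. The variable names `via_transform` and `via_function` are illustrative only.

```r
# Small deterministic example data (assumed values, for illustration only)
list_df <- list(
  data.frame(time = 1:3, growth = c(3, 8, 4)),
  data.frame(time = 1:3, growth = c(10, 5, 6))
)

# Extra arguments after the function are passed on to transform();
# `growth` is looked up as a column of each data frame.
via_transform <- lapply(list_df, transform, difference = c(NA, diff(growth)))

# Equivalent explicit version with an anonymous function
via_function <- lapply(list_df, function(x) {
  x$difference <- c(NA, diff(x$growth))
  x
})

identical(via_transform[[1]]$difference, via_function[[1]]$difference)
# TRUE
```

The R documentation describes `transform` as a convenience function intended mainly for interactive use, so the explicit version may be the safer choice inside packages or longer scripts.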

Data

set.seed(123)
list_df <- list(data.frame(growth = sample(10), b = letters[1:10]), 
               data.frame(growth = sample(10), b = letters[1:10]))