Custom function over a list of data frames

Time: 2017-05-18 12:04:45

Tags: r list function dataframe custom-function

I have the following data:

df = data.frame(a=rnorm(10), b=rnorm(10), c=rnorm(10), d=rnorm(10), e=rnorm(10), f=rnorm(10), g=rnorm(10), h=rnorm(10), 
      i=rnorm(10), j=rnorm(10), u=rnorm(10), v=rnorm(10), w=rnorm(10))

list1 <- data.frame(x = c("a", "b", "c"), y = "u")
list2 <- data.frame(x = c("e", "f", "g", "h"), y = "v")
list3 <- data.frame(x = c("i", "j"), y = "w")

the_list <- list(list1, list2, list3)

What I want is the following:

library(broom)  # provides tidy()

mod_u <- lm(u ~ a + b + c, data = df[,c("a","b","c","u")])
out_u <- tidy(mod_u)
mod_v <- lm(v ~ e + f + g + h, data = df[,c("e","f","g","h","v")])
out_v <- tidy(mod_v)
mod_w <- lm(w ~ i + j, data = df[,c("i","j","w")])
out_w <- tidy(mod_w)

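Please suggest a suitable way to do this; I am stuck and need exactly this output. I also have to do a lot more with the outputs (out_u, out_v, out_w), but I am stuck at the very first step. Thanks in advance!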

1 answer:

Answer 0 (score: 0)

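If we need one output per dataset in the_list, we can build each formula as a string from the x and y columns and fit it against df:
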
lapply(the_list, function(dat) tidy(lm(paste(as.character(unique(dat$y)), 
                   '~', paste(dat$x, collapse="+")), data = df)))
#[[1]]
#         term   estimate std.error  statistic   p.value
#1 (Intercept) -0.6147594 0.4857711 -1.2655331 0.2526020
#2           a -0.2226719 0.1657775 -1.3431973 0.2277829
#3           b  0.2713822 0.4310743  0.6295485 0.5521921
#4           c -0.2110712 0.3872685 -0.5450255 0.6053869

#[[2]]
#         term    estimate std.error  statistic   p.value
#1 (Intercept)  0.81270636 0.4430282  1.8344347 0.1260466
#2           e -0.35993485 0.2882703 -1.2486018 0.2670861
#3           f -0.04874279 0.3310058 -0.1472566 0.8886832
#4           g -0.82340169 0.6387350 -1.2891132 0.2537732
#5           h  0.11134583 0.3392967  0.3281666 0.7560826

#[[3]]
#         term    estimate std.error   statistic   p.value
#1 (Intercept) -0.01385133 0.4916071 -0.02817561 0.9783085
#2           i  0.28314525 0.5384137  0.52588791 0.6152094
#3           j -0.20502088 0.4558995 -0.44970634 0.6665173
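
Since the question asks for separately named results (out_u, out_v, out_w), here is a minimal sketch of one way to label the list elements by their response variable. It is not part of the original answer: the helper name fit_one, the seed, and the use of reformulate() instead of a pasted string are my own choices for illustration, and it assumes the broom package is installed.

```r
library(broom)  # provides tidy()

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
cols <- c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "u", "v", "w")
df <- as.data.frame(matrix(rnorm(10 * length(cols)), ncol = length(cols),
                           dimnames = list(NULL, cols)))

list1 <- data.frame(x = c("a", "b", "c"), y = "u")
list2 <- data.frame(x = c("e", "f", "g", "h"), y = "v")
list3 <- data.frame(x = c("i", "j"), y = "w")
the_list <- list(list1, list2, list3)

# Hypothetical helper: build y ~ x1 + x2 + ... from one spec and fit it on df
fit_one <- function(dat) {
  response <- as.character(unique(dat$y))
  f <- reformulate(as.character(dat$x), response = response)
  tidy(lm(f, data = df))
}

outs <- lapply(the_list, fit_one)

# Name each result after its response variable: out_u, out_v, out_w
responses <- vapply(the_list, function(dat) as.character(unique(dat$y)),
                    character(1))
names(outs) <- paste0("out_", responses)

outs$out_u  # coefficient table for u ~ a + b + c
```

Keeping the results in one named list, rather than creating separate global variables, makes later steps easier to apply to all models at once. For example, if dplyr is available, `dplyr::bind_rows(outs, .id = "model")` should stack the tables with a column identifying each model.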