R: get the best N values of all list subsets

Time: 2017-08-04 12:15:43

Tags: r list extract

I have the summaries of many linear models in a list called "listlmsummary".

listlmsummary <- lapply(listlm, summary)
listlmsummary

The output of listlmsummary looks like this (quite shortened):

$a
Residual standard error: 3835 on 1921 degrees of freedom
  (50 observations deleted due to missingness)
Multiple R-squared:   0.11, Adjusted R-squared:  0.1063 
F-statistic: 29.68 on 8 and 1921 DF,  p-value: < 2.2e-16

$b
Residual standard error: 3843 on 1898 degrees of freedom
  (68 observations deleted due to missingness)
Multiple R-squared:  0.1125,    Adjusted R-squared:  0.1065 
F-statistic: 18.51 on 13 and 1898 DF,  p-value: < 2.2e-16

$c
Residual standard error: 3760 on 1881 degrees of freedom
  (87 observations deleted due to missingness)
Multiple R-squared:  0.1221,    Adjusted R-squared:  0.117 
F-statistic: 23.79 on 11 and 1881 DF,  p-value: < 2.2e-16

$d
Residual standard error: 3826 on 1907 degrees of freedom
  (60 observations deleted due to missingness)
Multiple R-squared:  0.115, Adjusted R-squared:  0.1094 
F-statistic: 20.64 on 12 and 1907 DF,  p-value: < 2.2e-16

I want to extract the highest N (e.g. 2) adjusted R-squared values to find the best models, together with the name of the list element each value comes from. Does anyone have an idea how to do this?

I know that I can get a single R-squared value with this call:

listlmsummary[["a"]]$adj.r.squared

But extracting all R-squared values with something like listlmsummary[[]]$adj.r.squared or listlmsummary[[c("a", "b", "c", "d")]]$adj.r.squared and then ordering the output does not work.

Thank you for any help! :)

3 answers:

Answer 0: (score: 4)

We can use sapply to extract the adj.r.squared values into a vector and order them decreasingly. Then take the head of 'n' elements from the reordered 'listlmsummary'.

i1 <- order(-sapply(listlmsummary, `[[`, "adj.r.squared"))
head(listlmsummary[i1], n)

NOTE: This was answered with the logic and the complete solution requested by the user
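If only the values and the names of their list elements are needed, rather than the full summaries, the same idea can be applied to the extracted vector itself. The following is a minimal sketch, not part of the original answer: the models are fitted on the built-in mtcars data purely for illustration, and the names adj_r2 and top_n are made up here.

```r
# Illustrative models on built-in data (assumption: stands in for the asker's listlm)
listlm <- list(
  a = lm(mpg ~ wt, data = mtcars),
  b = lm(mpg ~ wt + hp, data = mtcars),
  c = lm(mpg ~ wt + hp + qsec, data = mtcars),
  d = lm(mpg ~ disp, data = mtcars)
)
listlmsummary <- lapply(listlm, summary)

# Named numeric vector: one adjusted R-squared per list element
adj_r2 <- sapply(listlmsummary, `[[`, "adj.r.squared")

# Sort decreasingly and keep the best n; the names say which model each value belongs to
n <- 2
top_n <- head(sort(adj_r2, decreasing = TRUE), n)
top_n
```

Because sapply keeps the list names, names(top_n) gives the list elements to look up, e.g. listlmsummary[names(top_n)].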

Answer 1: (score: 3)

sapply(listlmsummary, function(x) x$adj.r.squared)

Also see the new broom package.
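As a rough illustration of the broom route, glance() returns a one-row data frame of model-level statistics for a fitted lm, including an adj.r.squared column. The sketch below assumes the broom package is installed and again uses models fitted on mtcars for illustration only; fits and top are hypothetical names.

```r
library(broom)

# Illustrative models on built-in data (assumption: stands in for the asker's listlm)
listlm <- list(
  a = lm(mpg ~ wt, data = mtcars),
  b = lm(mpg ~ wt + hp, data = mtcars),
  c = lm(mpg ~ wt + hp + qsec, data = mtcars)
)

# One row of model-level statistics per model, stacked into a single data frame;
# note that glance() is called on the lm objects, not on their summaries
fits <- do.call(rbind, lapply(listlm, function(m) as.data.frame(glance(m))))
fits$model <- names(listlm)

# Best n models by adjusted R-squared
n <- 2
top <- head(fits[order(fits$adj.r.squared, decreasing = TRUE),
                 c("model", "adj.r.squared")], n)
top
```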

Answer 2: (score: 1)

A quick and dirty approach could be:

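# adj.r.squared is stored in the summaries, so extract from listlmsummary (not the lm objects)
adj_r2 <- sapply(listlmsummary, `[[`, "adj.r.squared")
Maxr2sq <- max(adj_r2)
# which() returns every position that reaches the maximum, with the element name
Position <- which(adj_r2 == Maxr2sq)
Maxr2sq
Position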

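However, you may want to store all results in a data.frame for future reference. For example, several models could in theory have the same Adj. R2 value. It is also handy to store the call (i.e. the formula) of each regression.

In that case, you can run: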

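library(tidyverse)

# Adjusted R-squared of each summary, stacked into a one-column data frame
AR2 <- sapply(listlmsummary, "[", i = "adj.r.squared") %>%
       stack() %>%
       select(values) %>%
       rename(Adj.R.sqr = values)
# Call (i.e. the formula) of each regression as a character string
Call <- as.character(sapply(listlmsummary, "[", i = "call"))
# Position of each model in the list
Position <- data.frame(Position = seq_along(listlmsummary))
# as_tibble() replaces the deprecated as_data_frame()
DF <- as_tibble(cbind(AR2, Call, Position))
DF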