从ggplot2中的二阶多项式提取系数

时间:2019-02-11 23:22:58

标签: r ggplot2 regression lm summary

我一直在研究但尚未找到解决方案的方法是从正在生成的二阶多项式中提取系数。我不一定需要y = A + Bx Cx ^ 2的所有系数(主要只需要C),但是我不确定如何从变量plot_5访问信息。我已经阅读了一些有关summary,str,lm和post的内容,其中用户说该任务尚未实现到ggplot2中,因为ggplot2用于可视化。我想做的是在Infil_Data2中创建一个新变量(“ coef”),并从特定的站点ID重复系数值。

例如,下面的数据框将添加一个名为“ coef”的新变量,前11行为0.00854,随后11行为0.0154,最后其余11行为0.00839。行号并不是那么重要,因为数据将被相应地分组。

    library(purrr)
    library(tidyverse)
    library(ggpmisc)

    plot_5 <-
      Infil_Data2 %>% 
      split(.$Site_ID) %>% 
      map2(names(.),
           ~ggplot(.x, aes(Sqrt_Time.x, Cal_Vol_cm)) + 
             geom_point() +
             labs(title = paste(.y)) +
             theme(plot.title = element_text(hjust = 0.5)) + 
             stat_smooth(mapping = aes(x = Sqrt_Time.x, y = Cal_Vol_cm?),
                         method = "lm", se = FALSE, 
                         formula = y ~ poly(x, 2, raw = TRUE),
                         #check raw = TRUE, some say raw = FALSE gives a better fit
                         color = "red") +
             theme(plot.margin = unit(c(1, 5, 1, 1), "cm")) +
             stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label..,         sep = "~~~")),
                          label.x.npc = "left", label.y.npc = 0.90, #set the position of the eq
                          formula = y ~ poly(x, 2, raw = TRUE), parse = TRUE, rr.digits = 3))

    pdf("allplots5.pdf", onefile = TRUE)
    walk(plot_5, print)
    dev.off()

    Infil_Data2 <-
        structure(list(Time = c(0L, 30L, 60L, 90L, 120L, 150L, 180L, 
        210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
        210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
        210L, 240L, 270L, 300L), Site_ID = c("H1", "H1", "H1", "H1", 
        "H1", "H1", "H1", "H1", "H1", "H1", "H1", "H2", "H2", "H2", "H2", 
        "H2", "H2", "H2", "H2", "H2", "H2", "H2", "H3", "H3", "H3", "H3", 
        "H3", "H3", "H3", "H3", "H3", "H3", "H3"), Vol_mL = c(63, 62, 
        60, 59, 58, 56, 54, 52.5, 50, 48.5, 46.5, 82, 77, 73, 68, 65, 
        51, 56, 52, 47.5, 42.5, 37.5, 69, 67, 65, 63, 61, 60, 58, 56, 
        54, 51.5, 49), Sqrt_Time.x = c(0, 5.477225575, 7.745966692, 9.486832981, 
        10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
        16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
        10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
        16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
        10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
        16.43167673, 17.32050808), Cal_Vol_cm = c(0, 0.124339799, 0.373019398, 
        0.497359197, 0.621698996, 0.870378595, 1.119058194, 1.305567893, 
        1.616417391, 1.80292709, 2.051606688, 0, 0.621698996, 1.119058194, 
        1.74075719, 2.113776588, 3.854533778, 3.232834782, 3.730193979, 
        4.289723076, 4.911422072, 5.533121068, 0, 0.248679599, 0.497359197, 
        0.746038796, 0.994718394, 1.119058194, 1.367737792, 1.616417391, 
        1.865096989, 2.175946488, 2.486795986)), row.names = c(NA, 33L
        ), class = "data.frame")

1 个答案:

答案 0 :(得分:0)

我没有您的数据,但这是使用broomlm中提取系数的示例:

library(tidyverse)
library(broom)

lm(mpg ~ wt + I(wt^2), data = mtcars) %>%
  tidy() %>%
  filter(term == "I(wt^2)") %>%
  pull(estimate)
# [1] 1.171087

编辑,将应用程序添加到提供的数据中:

或者,用您的Infil_Data2

lm(Cal_Vol_cm ~ poly(Sqrt_Time.x, 2, raw = TRUE), data = Infil_Data2) %>%
  tidy() %>%
  slice(3) %>%  # In this case, coefficient "C" is in third row
  pull(estimate)
# [1] 0.01078006