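My question is more about improving my coding skills than about solving a problem: I was able to find a solution, but I find it inelegant.
I'm working on a more complex version of the problem posted here. I'm running multiple linear regressions and I want to export the coefficients from all of them to a single csv file. Using this information, I was able to generate a list of all the coefficients and convert it into a list of data frames. My list of data frames looks like this: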
> coef.df
[[1]]
                  Estimate Std. Error    z value     Pr(>|z|)
(Intercept)    -0.08670899   0.357377 -0.2426261 0.8082950694
Var.0.0.Type.4 22.46262205   5.935317  3.7845698 0.0001539747
[[2]]
                 Estimate Std. Error    z value     Pr(>|z|)
(Intercept)    -0.1682616  0.3590799 -0.4685911 6.393619e-01
Var.0.5.Type.4 15.4974199  3.8693290  4.0051957 6.196616e-05
[[3]]
                 Estimate Std. Error    z value     Pr(>|z|)
(Intercept)    -0.1832488  0.3532577 -0.5187397 6.039423e-01
Var.1.0.Type.4 10.1225605  2.4475064  4.1358668 3.536172e-05
and so on.
When I simply convert this list to a csv file, the row names get mangled (every "(Intercept)" term after the first has a number appended):
                  Estimate Std. Error     z value     Pr(>|z|)
(Intercept)    -0.08670899  0.3573770 -0.24262609 8.082951e-01
Var.0.0.Type.4 22.46262205  5.9353171  3.78456983 1.539747e-04
(Intercept)1   -0.16826164  0.3590799 -0.46859114 6.393619e-01
Var.0.5.Type.4 15.49741993  3.8693290  4.00519568 6.196616e-05
(Intercept)2   -0.18324877  0.3532577 -0.51873968 6.039423e-01
Var.1.0.Type.4 10.12256045  2.4475064  4.13586682 3.536172e-05
(Intercept)3   -0.14188918  0.3426645 -0.41407607 6.788184e-01
Var.1.5.Type.4  6.32348365  1.5164421  4.16994719 3.046702e-05
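To make the renaming concrete, here is a minimal reproduction with made-up numbers. It assumes the list is combined with do.call(rbind, ...), which may not be exactly how the csv above was produced; `toy`, `cols` and `combined` are names invented for this sketch.

```r
# Two toy coefficient tables (made-up numbers), each with an "(Intercept)" row
cols <- c("Estimate", "Std. Error", "z value", "Pr(>|z|)")
toy <- lapply(c("Var.0.0.Type.4", "Var.0.5.Type.4"), function(v) {
  as.data.frame(matrix(c(-0.1, 20, 0.36, 5.9, -0.28, 3.4, 0.78, 0.0007),
                       nrow = 2,
                       dimnames = list(c("(Intercept)", v), cols)))
})

# Row-binding data frames forces the row names to be unique,
# so the repeated "(Intercept)" gets a numeric suffix
combined <- do.call(rbind, toy)
rownames(combined)
```

The second intercept row should come out as "(Intercept)1", which matches the pattern in the csv output above.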
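I know that row names must be unique, and I would like to customize them using the name of the second coefficient in each model. What I want is a csv file that contains all the information in the following format, with each intercept's row name adjusted to show which variable it belongs to: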
                         Estimate Std. Error    z value     Pr(>|z|)
(Intercept.0.0.Type.4) -0.0867089   0.357377 -0.2426261 0.8082950694
Var.0.0.Type.4         22.4626220   5.935317  3.7845698 0.0001539747
(Intercept.0.5.Type.4) -0.1682616   0.359079 -0.4685911 6.393619e-01
Var.0.5.Type.4         15.4974199   3.869329  4.0051957 6.196616e-05
(Intercept.1.0.Type.4) -0.1832488   0.353257 -0.5187397 6.039423e-01
Var.1.0.Type.4         10.1225605   2.447506  4.1358668 3.536172e-05
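I don't have much experience with partial string replacement, and although I managed to do it, I don't think my code is the most straightforward. Here is how I got this result: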
#I created a vector containing all row names
df.names <- unlist(lapply(coef.df,rownames))
> df.names
 [1] "(Intercept)"    "Var.0.0.Type.4" "(Intercept)"    "Var.0.5.Type.4"
 [5] "(Intercept)"    "Var.1.0.Type.4" "(Intercept)"    "Var.1.5.Type.4"
 [9] "(Intercept)"    "Var.0.0.Type.5" "(Intercept)"    "Var.0.5.Type.5"
[13] "(Intercept)"    "Var.1.0.Type.5" "(Intercept)"    "Var.1.5.Type.5"
#I created a vector with all "(Intercept)" elements from df.names
inter.lm <- df.names[c(TRUE, FALSE)]
> inter.lm
[1] "(Intercept)" "(Intercept)" "(Intercept)" "(Intercept)" "(Intercept)"
[6] "(Intercept)" "(Intercept)" "(Intercept)"
#I created a vector with all remaining elements from df.names
var.lm <- df.names[c(FALSE, TRUE)]
> var.lm
[1] "Var.0.0.Type.4" "Var.0.5.Type.4" "Var.1.0.Type.4" "Var.1.5.Type.4"
[5] "Var.0.0.Type.5" "Var.0.5.Type.5" "Var.1.0.Type.5" "Var.1.5.Type.5"
#I removed the "Var" part from all elements in var.lm
var.temp <- gsub("Var(.*)", "\\1", var.lm)
> var.temp
[1] ".0.0.Type.4" ".0.5.Type.4" ".1.0.Type.4" ".1.5.Type.4" ".0.0.Type.5"
[6] ".0.5.Type.5" ".1.0.Type.5" ".1.5.Type.5"
#I removed the ")" part from all elements in inter.lm
inter.temp <- gsub("\\)", "", inter.lm)
> inter.temp
[1] "(Intercept" "(Intercept" "(Intercept" "(Intercept" "(Intercept"
[6] "(Intercept" "(Intercept" "(Intercept"
#I pasted together vectors inter.temp and var.temp to get the required names
inter.new <- paste(inter.temp,var.temp,")",sep="")
> inter.new
[1] "(Intercept.0.0.Type.4)" "(Intercept.0.5.Type.4)" "(Intercept.1.0.Type.4)"
[4] "(Intercept.1.5.Type.4)" "(Intercept.0.0.Type.5)" "(Intercept.0.5.Type.5)"
[7] "(Intercept.1.0.Type.5)" "(Intercept.1.5.Type.5)"
#I merged the inter.new and var.lm vectors to get the correct naming
df.names <- c(rbind(inter.new, var.lm))
> df.names
 [1] "(Intercept.0.0.Type.4)" "Var.0.0.Type.4"
 [3] "(Intercept.0.5.Type.4)" "Var.0.5.Type.4"
 [5] "(Intercept.1.0.Type.4)" "Var.1.0.Type.4"
 [7] "(Intercept.1.5.Type.4)" "Var.1.5.Type.4"
 [9] "(Intercept.0.0.Type.5)" "Var.0.0.Type.5"
[11] "(Intercept.0.5.Type.5)" "Var.0.5.Type.5"
[13] "(Intercept.1.0.Type.5)" "Var.1.0.Type.5"
[15] "(Intercept.1.5.Type.5)" "Var.1.5.Type.5"
#Finally I changed the row names
rownames(final.df) <- df.names
Is there a simpler/shorter way to get the names I want?
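For comparison, here is a rough sketch of the kind of shorter route I have in mind. It renames the intercept row inside each data frame before binding, so no interleaving of vectors is needed. It assumes every data frame has exactly two rows (intercept first) and that the variable name starts with "Var". The toy coef.df holds made-up numbers, and `make_coef`, `rename_intercept` and the output file name are invented for illustration.

```r
# Toy stand-in for coef.df (made-up numbers, same shape as the real list)
make_coef <- function(var_name) {
  m <- matrix(c(-0.1, 10, 0.35, 2.5, -0.3, 4.0, 0.76, 0.0001), nrow = 2,
              dimnames = list(c("(Intercept)", var_name),
                              c("Estimate", "Std. Error", "z value", "Pr(>|z|)")))
  as.data.frame(m)
}
coef.df <- lapply(c("Var.0.0.Type.4", "Var.0.5.Type.4", "Var.1.0.Type.4"),
                  make_coef)

# Build the intercept name from the second row name:
# "Var.0.0.Type.4" -> ".0.0.Type.4" -> "(Intercept.0.0.Type.4)"
rename_intercept <- function(d) {
  suffix <- sub("^Var", "", rownames(d)[2])
  rownames(d)[1] <- paste0("(Intercept", suffix, ")")
  d
}

# Row names are already unique here, so rbind leaves them alone
final.df <- do.call(rbind, lapply(coef.df, rename_intercept))
rownames(final.df)

# Hypothetical output file name
write.csv(final.df, "coefficients.csv")
```

Doing the rename per data frame keeps each intercept tied to its own model, instead of relying on the odd/even position of elements in one long vector of names.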