我想将lm()应用于按主题分组的观察,但无法解决sapply语法问题。最后,我想要一个数据帧,每个主题有1行,截距和斜率(即行数:subj,lm $ coefficients [1] lm $ coefficients [2])
set.seed(1)
subj <- rep(c("a","b","c"), 4) # 4 observations each on 3 experimental subjects
ind <- rnorm(12) #12 random numbers, the independent variable, the x axis
dep <- rnorm(12) + .5 #12 random numbers, the dependent variable, the y axis
df <- data.frame(subj=subj, ind=ind, dep=dep)
s <- (split(df,subj)) # create a list of observations by subject
我可以从s中提取一组观察结果,制作数据帧,然后得到我想要的结果:
df2 <- as.data.frame(s[1])
df2
lm1 <- lm(df2$a.dep ~ df2$a.ind)
lm1$coefficients[1]
lm1$coefficients[2]
我无法循环遍历s的所有元素并将数据转换为我想要的最终格式:
lm.list <- sapply(s, FUN= function(x)
(lm(x[ ,"dep"] ~ x[,"ind"])))
a <-as.data.frame(lm.list)
我觉得我需要对下面的结构进行某种转置;列(a,b,c)是我想要的行,但是t(a)不起作用。
head(a)
a
coefficients 0.1233519, 0.4610505
residuals 0.4471916, -0.3060402, 0.4460895, -0.5872409
effects -0.6325478, 0.6332422, 0.5343949, -0.7429069
rank 2
fitted.values 0.74977179, 0.09854505, -0.05843569, 0.47521446
assign 0, 1
b
coefficients 1.1220840, 0.2024222
residuals -0.04461432, 0.02124541, 0.27103003, -0.24766112
effects -2.0717363, 0.2228309, 0.2902311, -0.2302195
rank 2
fitted.values 1.1012775, 0.8433366, 1.1100777, 1.0887808
assign 0, 1
c
coefficients 0.2982019, 0.1900459
residuals -0.5606330, 1.0491990, 0.3908486, -0.8794147
effects -0.6742600, 0.2271767, 1.1273566, -1.0345665
rank 2
fitted.values 0.3718773, 0.2193339, 0.5072572, 0.2500516
assign 0, 1
答案 0 :(得分:4)
听起来,这可能是你想要做的事情:
sapply(s, FUN= function(x)
lm(x[ ,"dep"] ~ x[,"ind"])$coefficients[c(1, 2)])
# a b c
# (Intercept) 0.71379430 -0.6817331 0.5717372
# x[, "ind"] 0.07125591 1.1452096 -1.0303726
我看到它注意到,一般来说,如果你分裂然后使用s/lapply
,你通常可以直接跳到by
并跳过split
步骤:< / p>
do.call(rbind,
by(data = df, INDICES=df$subj, FUN=function(x)
lm(x[, "dep"] ~ x[, "ind"])$coefficients[c(1, 2)]))
# (Intercept) x[, "ind"]
# a 0.7137943 0.07125591
# b -0.6817331 1.14520962
# c 0.5717372 -1.03037257
或者,您可以使用其中一个允许您更方便地进行此类计算的软件包,例如“data.table”:
library(data.table)
DT <- data.table(df)
DT[, list(Int = lm(dep ~ ind)$coefficients[1],
Slo = lm(dep ~ ind)$coefficients[2]), by = subj]
# subj Int Slo
# 1: a 0.7137943 0.07125591
# 2: b -0.6817331 1.14520962
# 3: c 0.5717372 -1.03037257
答案 1 :(得分:3)
nlme::lmList
怎么样?
library(nlme)
coef(lmList(dep~ind|subj,df))
## (Intercept) ind
## a 0.7137943 0.07125591
## b -0.6817331 1.14520962
## c 0.5717372 -1.03037257
如果需要,您可以转置它。