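I am trying to use the abline command to fit a line and find the R-squared value for the relationship between columns from two different data sources. My data manipulation is as follows: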
df<-data.frame(prop=as.numeric(as.character(prop$factorised column)), spv=column_from_a_matrix[,1]) #align the columns
df<-subset(df, !is.na(df$prop) & df$prop!=0.00) # as.numeric(as.character(...)) introduces NA
# values by coercion; drop those rows and the 0's
with(df, plot(prop, spv))
abline(fit <- lm(prop ~ spv, data=df), col='red')
legend("topright", bty="n", legend=paste("R2 is", format(summary(fit)$adj.r.squared, digits=4)))
However, the line is clearly wrong:
summary(lm(prop ~ spv, data=df))
Call:
lm(formula = prop ~ spv, data = df)
Residuals:
       Min         1Q     Median         3Q        Max
-3.6856794 -0.9907636 -0.3290975  0.6782148  7.7086112
Coefficients:
               Estimate  Std. Error  t value               Pr(>|t|)
(Intercept)  2.78546164  0.07840516 35.52651 < 0.000000000000000222 ***
spv         22.61806073  2.18325279 10.35980 < 0.000000000000000222 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.633367 on 432 degrees of freedom
Multiple R-squared: 0.1989994, Adjusted R-squared: 0.1971452
F-statistic: 107.3254 on 1 and 432 DF, p-value: < 0.00000000000000022204
Since I don't get any errors, even though the relationship is clearly wrong, I can't see where I've gone wrong...
The output of dput(df) is:
structure(list(prop = c(1.4, 2.5, 2.6, 3.3, 1.63, 0.9, 2.22, 2.43, 0.64, 1.4, 3.1, 0.41.....
spv = c(0.0636986180909399, -0.0596651855579327.....
.Names = c("prop", "spv"),
row.names = c(3L, 4L, 11L, 12L, 13L, 16L, 18L, 19L, 24L ....
My research on the abline command just tells me that what I'm doing is correct... Could someone point me in the right direction?
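One thing that may be worth checking (a sketch, not a confirmed diagnosis): plot(prop, spv) puts prop on the x-axis, while lm(prop ~ spv) treats prop as the response, so abline() may be drawing the fitted line against swapped axes. The example below uses made-up data, since the real df is truncated above, and passes the same formula to both lm() and plot() so that the two orientations agree.

```r
set.seed(1)

# Hypothetical stand-in data; these values are invented for illustration
# and are not the real df from the question.
df <- data.frame(spv = rnorm(100, mean = 0, sd = 0.05))
df$prop <- 2.8 + 22 * df$spv + rnorm(100, sd = 1.5)

# prop is the response (y-axis), spv the predictor (x-axis)
fit <- lm(prop ~ spv, data = df)

# Plot with the same formula so the axes match the model
plot(prop ~ spv, data = df)
abline(fit, col = "red")
legend("topright", bty = "n",
       legend = paste("R2 is", format(summary(fit)$adj.r.squared, digits = 4)))
```

If prop really is meant to be on the x-axis, the alternative would be to keep plot(prop, spv) and fit lm(spv ~ prop, data = df) instead.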