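I have the following dataset: https://archive.ics.uci.edu/ml/datasets/abalone
I am trying to plot a regression of whole weight against diameter.
The scatterplot of the data is clearly not linear. (For some reason I am unable to attach it.)
So I considered a quadratic regression model, which I set up like this: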
abalone <- read.csv("abalone.data")
diameter <- abalone$Diameter
diameter2 <- diameter^2
whole <- abalone$Whole.weight
quadraticModel <- lm( whole ~ diameter + diameter2)
This works fine, and calling quadraticModel gives the following:
Call:
lm(formula = whole ~ diameter + diameter2)
Coefficients:
(Intercept)     diameter    diameter2
     0.3477      -3.3555      10.4968
However, when I try to plot it:
abline(quadraticModel)
I get the following warning:
Warning message:
In abline(quadraticModel) :
only using the first two of 3 regression coefficients
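This means I get a straight line, which is not what I am after. Can someone explain why this happens and how to work around it? I have the same problem with cubic fits and so on (only the first two coefficients are ever plotted).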
Answer 0: (score: 0)
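You cannot use abline to plot a fitted polynomial regression: abline only draws straight lines, so it takes the intercept and the first slope and ignores the remaining coefficients. Try this: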
x<-sort(diameter)
y<-quadraticModel$fitted.values[order(diameter)]
lines(x, y)
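The lines above connect the fitted values at the observed diameters. A variant that may give a smoother curve is to predict on an evenly spaced grid instead. The following is only a minimal sketch under stated assumptions: the data are synthetic stand-ins rather than the abalone file, the names `dat`, `grid` and `fit` are made up for illustration, and the squared term is written as `I(diameter^2)` inside the formula so that `predict()` can rebuild it from new data.

```r
# Synthetic stand-in for the abalone data (assumption: not the real file)
set.seed(1)
dat <- data.frame(diameter = runif(200, 0.1, 0.6))
dat$whole <- 0.35 - 3.4 * dat$diameter + 10.5 * dat$diameter^2 +
  rnorm(200, sd = 0.05)

# Quadratic fit; I() makes ^ mean arithmetic squaring inside the formula
fit <- lm(whole ~ diameter + I(diameter^2), data = dat)

# Evenly spaced grid over the observed range of diameters
grid <- data.frame(diameter = seq(min(dat$diameter), max(dat$diameter),
                                  length.out = 100))
grid$whole_hat <- predict(fit, newdata = grid)

# Scatterplot with the fitted curve drawn on top
plot(dat$diameter, dat$whole, xlab = "Diameter", ylab = "Whole weight")
lines(grid$diameter, grid$whole_hat, col = "red", lwd = 2)
```

The same pattern should carry over to a cubic fit by adding `I(diameter^3)` to the formula; the grid and the `lines()` call stay unchanged.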
Answer 1: (score: 0)
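Your model is set up as a linear model with diameter and squared diameter as two separate predictors. That is still a quadratic in diameter, but poly() lets you state the polynomial directly in the formula. Try this: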
library(stats)
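# abalone.data has no header row, so read.csv() treats the first record as
# column names; pass header = FALSE if you want to keep every row
df <- read.csv("abalone.data")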
var_names <-
c(
"Sex",
"Length",
"Diameter",
"Height",
"Whole_weight",
"Shucked_weight",
"Viscera_weight",
"Shell_weight",
"Rings"
)
colnames(df) <- var_names
fit <- lm(df$Whole_weight ~ poly(df$Diameter, 2))
summary(fit)
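diameter <- df$Diameter
# The formula refers to df$Diameter directly, so a newdata column named "x"
# would be ignored; predict(fit) returns the fitted values at the observed diameters
predicted_weight <- predict(fit)
plot(diameter, predicted_weight)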
> summary(fit)
Call:
lm(formula = df$Whole_weight ~ poly(df$Diameter, 2))
Residuals:
     Min       1Q   Median       3Q      Max
-0.66800 -0.06579 -0.00611  0.04590  0.97396
Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)            0.828818   0.002054  403.44   <2e-16 ***
poly(df$Diameter, 2)1 29.326043   0.132759  220.90   <2e-16 ***
poly(df$Diameter, 2)2  8.401508   0.132759   63.28   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1328 on 4173 degrees of freedom
Multiple R-squared: 0.9268, Adjusted R-squared: 0.9267
F-statistic: 2.64e+04 on 2 and 4173 DF, p-value: < 2.2e-16
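A note on how this relates to the model in the question: the coefficients differ because `poly(x, 2)` uses an orthogonal polynomial basis by default, not because the curve is different. A model with a hand-built squared term is linear in its coefficients but quadratic in the predictor, so it should describe the same fitted curve, and `poly(x, 2, raw = TRUE)` should reproduce its coefficients. The check below is a small sketch on synthetic data (not the abalone file); `x`, `y` and the `fit_*` names are illustrative only.

```r
# Synthetic data (assumption: stands in for Diameter and Whole weight)
set.seed(42)
x <- runif(300, 0.1, 0.6)
y <- 0.35 - 3.4 * x + 10.5 * x^2 + rnorm(300, sd = 0.05)
x2 <- x^2

fit_manual <- lm(y ~ x + x2)                  # squared term built by hand
fit_ortho  <- lm(y ~ poly(x, 2))              # orthogonal polynomial basis
fit_raw    <- lm(y ~ poly(x, 2, raw = TRUE))  # raw powers x and x^2

# Compare fitted values across the three parametrisations
same_fit_ortho <- isTRUE(all.equal(unname(fitted(fit_manual)),
                                   unname(fitted(fit_ortho))))
same_fit_raw   <- isTRUE(all.equal(unname(fitted(fit_manual)),
                                   unname(fitted(fit_raw))))

# Compare coefficients of the hand-built and raw-polynomial models
same_coef <- isTRUE(all.equal(unname(coef(fit_manual)),
                              unname(coef(fit_raw))))

print(c(same_fit_ortho, same_fit_raw, same_coef))
```

The orthogonal basis is mainly a numerical and inferential convenience (the polynomial terms are uncorrelated), so either form can be used for plotting the curve.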