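I built a linear model in R using county-level data from the ACS. My dataset has 3140 entries, each with a corresponding FIPS code. I'm trying to map the residuals from my linear model, but I only get 3139 residuals. Does anyone know whether R does something when fitting a linear model that would cause this, and how I can fix it so that I can create the map? Thanks!
In response to the suggestion to check for NAs, I ran this: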
which(completedata$fipscode == NA)
integer(0)
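One caveat about this check: in R, any comparison with `NA` using `==` evaluates to `NA`, so `which()` returns `integer(0)` whether or not missing values are present. `is.na()` is the usual test. A small sketch with a made-up vector (the values are hypothetical and not taken from the dataset):

```r
# Hypothetical FIPS-like vector with one missing value
x <- c(1001, NA, 1005)

# Comparing with == NA yields NA for every element, so which() finds nothing
which(x == NA)
#> integer(0)

# is.na() actually detects the missing entry
which(is.na(x))
#> [1] 2
```

Since `lm` drops a row when any variable in the formula is missing, it may be worth checking every column rather than only `fipscode`, for example with `colSums(is.na(completedatadf))`.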
Here is the R code, in case it helps:
sectorcodes <- read.csv("sectorcodes1.csv") # rural-urban code, median household income
sectorcodesdf <- data.frame(sectorcodes)
religion <- read.csv("Religion2.csv")
religiondf <- data.frame(religion)
merge1 <- merge(sectorcodesdf,religiondf, by = c('fipscode'))
merge1df <- data.frame(merge1)
family <- read.csv("censusdataavgfamsize.csv") #avgfamilysize
familydf <- data.frame(family)
merge2 <- merge(merge1df, familydf, by = c('fipscode'))
merge2df <- data.frame(merge2)
gradrate <- read.csv("censusdatahsgrad.csv")
gradratedf <- data.frame(gradrate)
evenmoredata2 <- merge(gradrate,merge2df, by=c("fipscode"))
#write.csv(evenmoredata2, file = "completedataset.csv")
completedata <- read.csv("completedataset.csv")
completedatadf <- data.frame(completedata)
lm8 <- lm(completedatadf$hsgrad ~ completedatadf$averagefamilysize*completedatadf$Rural_urban_continuum_code_2013*completedatadf$TOTADH*completedatadf$Median_Household_Income_2016)
summary(lm8)
library(blscrapeR)
library(RgoogleMaps)
library(choroplethr)
library(acs)
attach(acs)
require(choroplethr)
dataframe1 <- data.frame(completedatadf$fipscode,completedatadf$averagefamilysize)
names(dataframe1) <- c("region","value")
dataframe2 <- data.frame(completedata$fipscode,completedata$hsgrad)
names(dataframe2) <- c("region","value")
residdf <- data.frame(lm8$residuals)
dataframe3 <- data.frame(completedata$fipscode,lm8$residuals)
names(dataframe3) <- c("region","value")
county_choropleth(dataframe1)
county_choropleth(dataframe2)
county_choropleth(dataframe3)
When I try to create dataframe3, the error message is:
dataframe3 <- data.frame(completedata$fipscode,lm8$residuals)
Error in data.frame(completedata$fipscode, lm8$residuals) :
arguments imply differing number of rows: 3140, 3139
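Answer 0 (score: 1)
This is probably caused by an NA in the response (an NA in any predictor has the same effect, because lm drops incomplete rows by default). For example, using the built-in BOD data frame, note that there are 5 residuals in this example, but b has 6 rows: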
b <- BOD
b[3, 2] <- NA
nrow(b)
## [1] 6
fm <- lm(demand ~ Time, b)
resid(fm)
## 1 2 4 5 6
## -0.3578947 -0.2657895 1.6184211 -0.6894737 -0.3052632
We can fix this by specifying na.action = na.exclude when running lm. Note that there are now 6 residuals, the extra one being NA.
fm <- lm(demand ~ Time, b, na.action = na.exclude)
resid(fm)
## 1 2 3 4 5 6
## -0.3578947 -0.2657895 NA 1.6184211 -0.6894737 -0.3052632
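Applied to the mapping problem, the same idea might look like the sketch below. It uses a small made-up data frame in place of `completedatadf` (the column names `fipscode`, `hsgrad`, and `averagefamilysize` are borrowed from the question; the values are invented). One detail to watch: only `resid()` pads the result with `NA`; the raw `$residuals` component of the fit, which the question uses, still omits the dropped rows.

```r
# Toy stand-in for completedatadf; values are invented for illustration
toy <- data.frame(
  fipscode          = c(1001, 1003, 1005, 1007, 1009),
  hsgrad            = c(85.1, 88.4, NA, 80.2, 83.7),
  averagefamilysize = c(3.1, 2.9, 3.3, 3.0, 3.2)
)

# Identify rows that lm() would drop because of a missing value
which(!complete.cases(toy[, c("hsgrad", "averagefamilysize")]))

# Fit with na.exclude so that resid() is padded with NA for dropped rows
fit <- lm(hsgrad ~ averagefamilysize, data = toy, na.action = na.exclude)

# resid(fit) has one entry per row; fit$residuals would NOT be padded
mapdf <- data.frame(region = toy$fipscode, value = resid(fit))
nrow(mapdf)
```

Whether the plotting function accepts `NA` values is not something this sketch checks; dropping those rows first with `mapdf[!is.na(mapdf$value), ]` is one option.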
Answer 1 (score: 0)
Try:
data.frame(na.omit(completedata$fipscode), lm8$residuals)
Your data probably contains NA values.
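Note that `na.omit()` on `fipscode` only helps if the missing value is in `fipscode` itself. If the NA sits in another model variable, one alternative is to match residuals to rows by name, since the residuals keep the row names of the observations that were actually used. A minimal sketch with the built-in BOD data, where `Time` stands in for the region code (an assumption made purely for illustration):

```r
# Introduce a missing response value in row 3
b <- BOD
b[3, 2] <- NA

# Default na.action drops row 3, leaving 5 residuals
fm <- lm(demand ~ Time, b)

# The names of the residuals are the row names of the rows that were kept
keep <- names(resid(fm))

# Pair each residual with the identifier from its own row
aligned <- data.frame(region = b[keep, "Time"], value = resid(fm))
nrow(aligned)
```

For the model in the question, the analogous lookup would presumably be `completedatadf[names(lm8$residuals), "fipscode"]`, assuming the data frame still has its default row names.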