我有多个县的人口数据,并希望尽量减少每个县的重复拟合后勤增长曲线。
county year pop
lake 1970 69305
lake 1980 104870
lake 1990 152104
lake 2000 210528
lake 2010 297052
marion 1970 69030
marion 1980 122488
marion 1990 194833
marion 2000 258916
marion 2010 331298
seminole 1970 83692
seminole 1980 179752
seminole 1990 287529
seminole 2000 365196
seminole 2010 422718
目前,我正在对每个县进行分类:
lake<-countypop[1:5,2:3]
colnames(lake)<-c("year", "pop")
marion<-countypop[6:10,2:3]
colnames(marion)<-c("year", "pop")
seminole<-countypop[11:15,2:3]
然后使用SSlogis并绘制每个县的曲线,例如:
lake.model <- nls(pop ~ SSlogis(year, phi1, phi2, phi3, data = lake)))
alpha <- coef(lake.model)
plot(pop ~ year, data = lake, main = "Logistic Growth Model of Lake County",
xlab = "Year", ylab = "Population", xlim = c(1920, 2030),ylim=c(0,1000000))
curve(alpha[1]/(1 + exp(-(x - alpha[2])/alpha[3])), add = T, col = "blue")
我有大约60个县,我知道必须有一个更清洁的方法来做到这一点。如何使用apply函数,循环或其他东西来消除代码中的重复?
答案 0 :(得分:2)
试试这个:
pdf("countypop.pdf")
models <- by(countypop, countypop$county, function(x) {
fm <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = x)
plot(pop ~ year, x, main = county[1])
lines(fitted(fm) ~ year, x)
fm
})
dev.off()
注意:我们将此用作输入:
countypop <-
structure(list(county = c("lake", "lake", "lake", "lake", "lake",
"marion", "marion", "marion", "marion", "marion", "seminole",
"seminole", "seminole", "seminole", "seminole"), year = c(1970L,
1980L, 1990L, 2000L, 2010L, 1970L, 1980L, 1990L, 2000L, 2010L,
1970L, 1980L, 1990L, 2000L, 2010L), pop = c(69305L, 104870L,
152104L, 210528L, 297052L, 69030L, 122488L, 194833L, 258916L,
331298L, 83692L, 179752L, 287529L, 365196L, 422718L)), .Names = c("county",
"year", "pop"), class = "data.frame", row.names = c(NA, -15L))