Fitting multiple logistic growth curves in a data set

Time: 2017-09-17 23:59:02

Tags: r nls

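I have population data for multiple counties and want to minimize the repetition involved in fitting a logistic growth curve to each county.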

county      year    pop
lake        1970    69305
lake        1980    104870
lake        1990    152104
lake        2000    210528
lake        2010    297052
marion      1970    69030
marion      1980    122488
marion      1990    194833
marion      2000    258916
marion      2010    331298
seminole    1970    83692
seminole    1980    179752
seminole    1990    287529
seminole    2000    365196
seminole    2010    422718

Currently, I am subsetting each county by hand:

lake<-countypop[1:5,2:3]
colnames(lake)<-c("year", "pop")
marion<-countypop[6:10,2:3]
colnames(marion)<-c("year", "pop")
seminole<-countypop[11:15,2:3]

Then I use SSlogis and plot the curve for each county, for example:

lake.model <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = lake)
alpha <- coef(lake.model)
plot(pop ~ year, data = lake, main = "Logistic Growth Model of Lake County",
     xlab = "Year", ylab = "Population", xlim = c(1920, 2030), ylim = c(0, 1000000))
curve(alpha[1]/(1 + exp(-(x - alpha[2])/alpha[3])), add = TRUE, col = "blue")
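As a side note, the expression passed to curve() above is the same logistic function that SSlogis itself evaluates, Asym/(1 + exp((xmid - input)/scal)), so phi1, phi2 and phi3 play the roles of the asymptote, the midpoint year and the scale. A minimal sketch to check this, using made-up parameter values (Asym, xmid and scal below are hypothetical, not fitted to the data):

```r
# Hypothetical parameter values, chosen only for illustration
Asym <- 500000   # upper asymptote (phi1)
xmid <- 1995     # year at which the curve reaches Asym / 2 (phi2)
scal <- 12       # scale on the year axis (phi3)

year <- seq(1970, 2010, by = 10)

# Hand-written form, as used in the curve() call
by_hand <- Asym / (1 + exp(-(year - xmid) / scal))

# Same values from SSlogis; as.numeric() drops the gradient attribute
by_sslogis <- as.numeric(SSlogis(year, Asym, xmid, scal))

all.equal(by_hand, by_sslogis)
```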

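I have about 60 counties, and I know there must be a cleaner way to do this. How can I use an apply function, a loop, or something else to eliminate the repetition in my code?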

1 Answer:

Answer 0: (score: 2)

Try this:

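pdf("countypop.pdf")  # one page per county
models <- by(countypop, countypop$county, function(x) {
  # x is the subset of rows for a single county
  fm <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = x)
  # county is looked up in x: plot's formula method evaluates
  # its extra arguments in the data
  plot(pop ~ year, x, main = county[1])
  lines(fitted(fm) ~ year, x)
  fm  # return the fitted model so that models holds one fit per county
})
dev.off()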
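With around 60 counties, a single nls() fit that fails to converge would stop the whole by() call. The following is only a sketch of one possible variation, not part of the answer above: it wraps each fit in tryCatch(), draws a smooth predicted curve over the 1920 to 2030 range used in the question, and collects the coefficients of the fits that succeeded. The names out_file, fit_one, grid, fits and coefs are made up for illustration, and the data are rebuilt inline so that the block runs on its own.

```r
# Same data as the countypop input used in this answer, rebuilt compactly
countypop <- data.frame(
  county = rep(c("lake", "marion", "seminole"), each = 5),
  year   = rep(seq(1970L, 2010L, by = 10L), times = 3),
  pop    = c(69305, 104870, 152104, 210528, 297052,
             69030, 122488, 194833, 258916, 331298,
             83692, 179752, 287529, 365196, 422718),
  stringsAsFactors = FALSE
)

# Hypothetical output location; change to any path you prefer
out_file <- file.path(tempdir(), "countypop_curves.pdf")
grid <- 1920:2030  # years over which to draw the fitted curve

fit_one <- function(d) {
  # Return NULL instead of stopping if nls() cannot fit this county
  fm <- tryCatch(
    nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = d),
    error = function(e) NULL
  )
  plot(d$year, d$pop, main = d$county[1], xlab = "Year",
       ylab = "Population", xlim = range(grid))
  if (!is.null(fm)) {
    # Smooth curve from the fitted model, extrapolated over the grid
    lines(grid, predict(fm, newdata = data.frame(year = grid)), col = "blue")
  }
  fm
}

pdf(out_file)  # one page per county
fits <- lapply(split(countypop, countypop$county), fit_one)
dev.off()

# One row of phi1, phi2, phi3 per county whose fit succeeded
ok <- !vapply(fits, is.null, logical(1))
coefs <- do.call(rbind, lapply(fits[ok], coef))
coefs
```

Counties whose fit failed appear as NULL entries in fits, which makes it easy to see which ones need different starting values or a different model.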

Note: We used this as input:

countypop <- 
structure(list(county = c("lake", "lake", "lake", "lake", "lake", 
"marion", "marion", "marion", "marion", "marion", "seminole", 
"seminole", "seminole", "seminole", "seminole"), year = c(1970L, 
1980L, 1990L, 2000L, 2010L, 1970L, 1980L, 1990L, 2000L, 2010L, 
1970L, 1980L, 1990L, 2000L, 2010L), pop = c(69305L, 104870L, 
152104L, 210528L, 297052L, 69030L, 122488L, 194833L, 258916L, 
331298L, 83692L, 179752L, 287529L, 365196L, 422718L)), .Names = c("county", 
"year", "pop"), class = "data.frame", row.names = c(NA, -15L))