I have time-series data on neighbourhoods and their crime rates, with about 300 observations (neighbourhoods) per year:
Neighbourhood year rate
5001 2009 43.5
5001 2010 34.7
5001 2011 40.8
5002 2009 28.9
5002 2010 33.8
5002 2011 24.4
. . .
. . .
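I fitted a spline regression by group (one model per neighbourhood), plotted each neighbourhood's curve over time, and obtained 5 coefficients per neighbourhood (an intercept plus 4 spline basis terms):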
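library(plyr)     # dlply(), ldply()
library(splines)  # bs()
# Fit one model per neighbourhood; `rep` is that neighbourhood's subset of crimedf
models <- dlply(crimedf, "Neighbourhood", function(rep)
  lm(formula = rep$rate ~ bs(rep$year, 4)))
ldply(models, coefficients)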
CensusTract (Intercept) bs(rep$year,4)1 bs(rep$year,4)2 bs(rep$year,4)3 bs(rep$year,4)4
[1,] 5001 36.14530 17.502968 7.3978890 13.9133907 5.70852015
[2,] 5002 64.62910 19.849905 -16.5157307 -10.7476461 -20.38942429
[3,] 5003 60.62435 30.698498 3.8573157 10.3343372 2.42415531
[4,] 5004 117.41211 20.563574 -56.1543275 -42.1466082 -70.43030322
[5,] 5005 65.11512 6.628532 -0.1630097 -13.9509698 -32.82296251
[6,] 5006 71.56126 11.982026 -21.9261412 -20.3717788 -39.88968841
[7,] 5007 51.69142 13.757720 -0.9959946 17.3522501 1.03887025
I then clustered the spline coefficients with k-means and ended up with 3 clusters:
Cluster means:
CensusTract (Intercept) bs(rep$year,4)1 bs(rep$year,4)2 bs(rep$year,4)3 bs(rep$year,4)4
1 5046.074 45.52149 23.9330434 7.38961 7.318444 -0.938248
2 5033.760 63.23948 12.8276297 -14.20825 -15.074422 -23.002576
3 5031.500 113.91280 -0.8379427 -68.82647 -56.071503 -80.903656
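The cluster means above include a CensusTract column, which suggests the tract id went into k-means together with the coefficients. A minimal sketch of the clustering step with the id left out might look like the following. It uses only the 7 coefficient rows shown earlier, and the short column names (intercept, b1 to b4) are shorthand introduced here for illustration, not names from the original output.

```r
# Coefficient rows copied from the ldply() output above (first 7 tracts only).
# Column names are illustrative shorthand for the lm() coefficient names.
coefs <- data.frame(
  CensusTract = 5001:5007,
  intercept = c(36.14530, 64.62910, 60.62435, 117.41211, 65.11512, 71.56126, 51.69142),
  b1 = c(17.502968, 19.849905, 30.698498, 20.563574, 6.628532, 11.982026, 13.757720),
  b2 = c(7.3978890, -16.5157307, 3.8573157, -56.1543275, -0.1630097, -21.9261412, -0.9959946),
  b3 = c(13.9133907, -10.7476461, 10.3343372, -42.1466082, -13.9509698, -20.3717788, 17.3522501),
  b4 = c(5.70852015, -20.38942429, 2.42415531, -70.43030322, -32.82296251, -39.88968841, 1.03887025)
)

set.seed(1)  # k-means starts from random centres; fix the seed for repeatability

# Cluster on the coefficient columns only, leaving the tract id out.
km <- kmeans(coefs[, -1], centers = 3, nstart = 20)

# Attach the cluster label to each tract and inspect the centres.
coefs$cluster <- km$cluster
km$centers
```

With the id column dropped, the centres have one row per cluster and one column per coefficient, which is the shape needed to turn each centre back into a curve.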
My question is:
How can I plot these coefficients to see the 3 curves for these clusters?
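One possible approach, sketched under assumptions: a fitted spline curve is the basis matrix multiplied by the coefficient vector, so each cluster centre can be turned into a curve by multiplying it with the same basis used in the fits. This only makes sense if every neighbourhood was observed over the same set of years, because bs() chooses its knots from the data it is given. The year range below is hypothetical (the post only shows 2009 to 2011); the centres are the coefficient columns of the cluster means above, with the CensusTract column dropped.

```r
library(splines)

# Hypothetical year range; replace with sort(unique(crimedf$year)).
years <- 2000:2011

# Cluster centres copied from the k-means output above (coefficient columns only).
centers <- rbind(
  c(45.52149, 23.9330434, 7.38961, 7.318444, -0.938248),
  c(63.23948, 12.8276297, -14.20825, -15.074422, -23.002576),
  c(113.91280, -0.8379427, -68.82647, -56.071503, -80.903656)
)

# Design matrix: an intercept column plus the same B-spline basis as in the fits.
basis <- cbind(1, bs(years, 4))

# Each column of `curves` is the rate over time implied by one cluster centre.
curves <- basis %*% t(centers)

# Draw the three cluster curves on one set of axes.
matplot(years, curves, type = "l", lty = 1, col = 1:3,
        xlab = "year", ylab = "rate")
legend("topright", legend = paste("cluster", 1:3), col = 1:3, lty = 1)
```

An alternative that avoids relying on a shared basis would be to average the fitted values (from predict() on each model) within each cluster and plot those averages instead.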