Question

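I have count data, and I am trying to document my decision to use a negative binomial distribution rather than Poisson (I can't get a quasi-Poisson object in lme4). I'm running into trouble with the diagnostic plots (the vector is linked at the end of the post). I have been trying to use the distplot() function to inform which distribution I should model: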

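This is the outcome variable (physician count): plot(d1.2$totalmds)

[image: physician count]

It looks roughly Poisson:

[image]

But the mean and variance are not even close (two extreme values double the variance, but even so it is nowhere near the mean):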

> var(d1.2$totalmds, na.rm = T)
[1] 114240.7
> mean(d1.2$totalmds, na.rm = T)
[1] 89.3121

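My outcome is partly driven by population, so I use total population as an offset variable in my preliminary models. As far as I understand, this divides the outcome by the natural log of the offset variable, so totalmds / log(poptotal) is essentially what gets modeled. That looks like this:

[image]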
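One thing that may be worth checking here is how the offset actually enters the model. The following is a minimal sketch on simulated data, not my data: `pop`, `rate`, and `y` are made-up names and values used only for illustration. With a log link, offset(log(pop)) is added to the linear predictor with its coefficient fixed at 1, which suggests the quantity being modeled is the rate y / pop rather than y / log(pop).

```r
# Simulated example (hypothetical values, not the real data)
set.seed(1)
n    <- 500
pop  <- round(runif(n, 1e3, 1e5))    # hypothetical population sizes
rate <- 0.002                        # hypothetical physicians per person
y    <- rpois(n, lambda = rate * pop)  # counts proportional to population

# Intercept-only Poisson model with log(pop) as the offset
fit <- glm(y ~ 1 + offset(log(pop)), family = poisson)

# exp(intercept) estimates the per-person rate, i.e. y / pop
est_rate <- unname(exp(coef(fit)[1]))
print(est_rate)
print(sum(y) / sum(pop))  # the pooled rate, for comparison
```

If the two printed values agree, the offset is behaving as a rate denominator, so a plot of totalmds / poptotal may be the more relevant picture than totalmds / log(poptotal).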

But when I try to check this using the following:
Plot 1: distplot(x = d1.2$totalmds, type = "poisson")
Plot 2: distplot(x = d1.2$totalmds, type = "nbinomial") # looks way off
[image]

Plot 3: plot(fitdist(data = d1.2$totalmds, distr = "pois", method = "mle"))
Plot 4: plot(fitdist(data = d1.2$totalmds, distr = "nbinom", method = "mle")) # throws warnings
[image]

Plot 5: qqcomp(fitdist(data = d1.2$totalmds, distr = "pois", method = "mle"))
Plot 6: qqcomp(fitdist(data = d1.2$totalmds, distr = "nbinom", method = "mle")) # throws warnings
[image]
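As a numeric complement to the plots, the two candidate distributions can also be compared by maximized log-likelihood and AIC. This is only a sketch in base R: `x` is a simulated, overdispersed stand-in for d1.2$totalmds (hypothetical values), and the helper `nll_nb` is my own name, not part of any package.

```r
# Simulated stand-in for the real vector (hypothetical parameters)
set.seed(42)
x <- rnbinom(1000, size = 0.3, mu = 90)

# Poisson: the MLE of lambda is the sample mean
ll_pois <- sum(dpois(x, lambda = mean(x), log = TRUE))

# Negative binomial: minimize the negative log-likelihood,
# optimizing on the log scale so size and mu stay positive
nll_nb <- function(par) {
  -sum(dnbinom(x, size = exp(par[1]), mu = exp(par[2]), log = TRUE))
}
start <- c(log(mean(x)^2 / (var(x) - mean(x))), log(mean(x)))  # moment-based start
opt   <- optim(start, nll_nb)
ll_nb <- -opt$value

# AIC = 2k - 2 logLik, with k = 1 (Poisson) and k = 2 (NB)
aic <- c(pois = 2 * 1 - 2 * ll_pois, nbinom = 2 * 2 - 2 * ll_nb)
print(aic)
```

On the real data, x would be replaced with d1.2$totalmds after dropping NAs; a much lower AIC for nbinom would support the NB choice even if the plots look messy.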

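Does anyone have suggestions as to why these plots look a bit wonky/inconsistent?

As I mentioned, I use another variable as an offset in the actual analysis, in case that makes a difference.

Here is the vector: https://gist.github.com/timothyslau/f95a777b713eb33a2fe6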

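I'm pretty sure NB is better than Poisson, since the variance-to-mean ratio (VMR) is well above 1:

var(d1.2$totalmds, na.rm = TRUE)/mean(d1.2$totalmds, na.rm = TRUE) # variance-to-mean ratio (VMR) > 1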
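To make the size of the overdispersion concrete, here is a minimal sketch that uses only the variance and mean printed earlier in this post. The method-of-moments size estimate assumes the parameterization variance = mu + mu^2/size (the one dnbinom() uses with its mu argument); the names `v`, `m`, `vmr`, and `size_mom` are just for illustration.

```r
v <- 114240.7  # var(d1.2$totalmds, na.rm = T), copied from the output above
m <- 89.3121   # mean(d1.2$totalmds, na.rm = T), copied from the output above

vmr <- v / m   # a Poisson variable would have a ratio near 1

# Method-of-moments NB size, assuming var = mu + mu^2 / size
size_mom <- m^2 / (v - m)

print(c(vmr = vmr, size_mom = size_mom))
```

A very small size estimate would indicate extreme overdispersion, which might itself be part of why the NB fits throw warnings.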

But if NB is appropriate, the plots should look cleaner (I think, unless I'm doing something wrong with these plotting functions/packages).

Plotting count distributions with distplot()

0 answers: