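Plotting a Lorenz curve in R

Date: 2016-03-28 18:19:18

Tags: r

I would like to plot a Lorenz curve and calculate the Gini index, in order to determine what share of the parasites is carried by the top 20% most heavily infected hosts.

Here is my dataset:

Number of parasites per host: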

parasites = c(0,1,2,3,4,5,6,7,8,9,10)

Number of hosts observed with each of the parasite counts given above:

hosts = c(18,20,28,19,16,10,3,1,0,0,0)

To plot the Lorenz curve:

I manually calculated the cumulative percentages of parasites and hosts:

cumul_parasites <- cumsum(parasites)/max(cumsum(parasites))
cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
plot(cumul_hosts, cumul_parasites, type= "l")


I also tried the function Lc (package ineq):

library(ineq)
Lc.p <- Lc(parasites,n=hosts)
plot(Lc.p)


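Why are the two curves (the manual one and the one from the Lc function) different?

1 Answer:

Answer 0 (score: 2)

The two plots differ because, when you compute the cumulative percentage of parasites, each parasite count has to be multiplied by its frequency, that is, by the number of hosts carrying that many parasites. The manual version accumulates the bare counts 0 to 10 and ignores how many hosts each count applies to.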
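One way to see why the weighting is needed is to expand the grouped data into one value per host and compare. The following is a minimal sketch using only base R; the names `per_host`, `L_expanded`, `L_grouped`, and `group_end` are introduced here for illustration and are not part of the original post.

```r
parasites <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
hosts     <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# One entry per host: 18 zeros, 20 ones, 28 twos, and so on
per_host <- rep(parasites, times = hosts)

# Unweighted cumulative share of parasites on the expanded, sorted data
L_expanded <- cumsum(sort(per_host)) / sum(per_host)

# Frequency-weighted cumulative share on the grouped data
L_grouped <- cumsum(parasites * hosts) / sum(parasites * hosts)

# The grouped curve should coincide with the expanded one
# at the last host of each non-empty group
group_end <- cumsum(hosts)[hosts > 0]
all.equal(L_expanded[group_end], L_grouped[hosts > 0])
```

Multiplying by `hosts` is therefore just a shortcut for repeating each parasite count once per host.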

The correct solution is:

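library(ineq)  # provides Lc()
parasites = c(0,1,2,3,4,5,6,7,8,9,10)
hosts = c(18,20,28,19,16,10,3,1,0,0,0)
# weight each parasite count by the number of hosts carrying it
cumul_parasites <- cumsum(parasites*hosts)/max(cumsum(parasites*hosts))
cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
plot(cumul_hosts, cumul_parasites, type= "l")
# overlay the points computed by Lc() for comparison
Lc.p <- Lc(parasites, n = hosts)
lines(Lc.p$p, Lc.p$L, col = 2, lwd = 2, type = "p")
legend("topleft", c('My calc', 'LC'), col = 1:2, lty = c(1, NA), pch = c(NA, 1), box.col = 1)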

This matches the Lc calculation exactly.
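The match can also be checked numerically rather than by eye. This is a sketch that assumes the ineq package is installed and that the object returned by `Lc()` exposes the curve as components `p` and `L`, each beginning with an extra point at the origin.

```r
library(ineq)  # provides Lc()

parasites <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
hosts     <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# Manual, frequency-weighted cumulative shares
cumul_parasites <- cumsum(parasites * hosts) / sum(parasites * hosts)
cumul_hosts     <- cumsum(hosts) / sum(hosts)

Lc.p <- Lc(parasites, n = hosts)

# Assumption: Lc() prepends the origin (0, 0), so drop the first point
all.equal(Lc.p$p[-1], cumul_hosts)
all.equal(Lc.p$L[-1], cumul_parasites)
```

If both comparisons come back `TRUE`, the only remaining difference between the two plots is that the manual curve does not start at the origin.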

[Figure: comparison of the Lc curve and my calculation]
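For the original goal of a Gini index and the share of parasites carried by the 20% most infected hosts, both can be read off the weighted curve. The sketch below uses base R only and is one possible approach, not something taken from the answer: the Gini index is computed with the trapezoid rule, and the top 20% share by linear interpolation at a cumulative host share of 0.8. The names `keep`, `x`, `n`, `p`, `L`, `area`, `gini`, and `top20_share` are illustrative.

```r
parasites <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
hosts     <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# Drop empty groups so the host axis has no repeated values
keep <- hosts > 0
x <- parasites[keep]
n <- hosts[keep]

# Lorenz curve points, starting at the origin
p <- c(0, cumsum(n) / sum(n))          # cumulative share of hosts
L <- c(0, cumsum(x * n) / sum(x * n))  # cumulative share of parasites

# Gini index: 1 minus twice the trapezoidal area under the curve
area <- sum(diff(p) * (head(L, -1) + tail(L, -1)) / 2)
gini <- 1 - 2 * area

# Share of parasites carried by the 20% most infected hosts:
# whatever lies above the curve's height at p = 0.8
top20_share <- 1 - approx(p, L, xout = 0.8)$y

gini
top20_share
```

Linear interpolation is reasonable here because all hosts within a group carry the same number of parasites, so the curve is a straight segment across each group.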