Plotting x-y data with errors as Gaussian/normal curves (ideally in R)

Asked: 2013-08-14 16:35:11

Tags: r ggplot2 gaussian scatter-plot

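I have x-y data in which both x and y carry a +/- error (the same on each side). The data are of a type that is normally distributed in both the x and y directions. At the moment we plot them as typical x-y crosses, or with geom_rect(), but both have problems representing the data faithfully. I am looking for a solution that lets each x-y data point be drawn as some kind of normal/Gaussian distribution (rather than just a +), as in the rough sketch below.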

x-y plot with normal distributions for both errors

Here is a sample data frame.

  

mydata <- structure(list(
  Age = c(2003L, 1999L, 1995L, 1993L, 1993L, 1990L, 1988L, 1987L, 1985L, 1984L, 1983L,
    1975L, 1974L, 1972L, 1963L, 1960L, 1959L, 1957L, 1953L, 1951L, 1951L, 1946L, 1940L,
    1936L, 1930L, 1927L, 1919L, 1914L, 1906L, 1885L, 1864L, 1842L, 1830L, 1810L, 1803L,
    1783L, 1762L, 1741L, 1720L, 1699L, 1678L, 1657L),
  Age_error = c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 4L, 2L, 2L, 2L, 3L, 5L, 3L,
    3L, 4L, 6L, 4L, 8L, 5L, 7L, 5L, 10L, 14L, 17L, 23L, 21L, 20L, 53L, 67L, 30L, 30L,
    30L, 30L, 30L, 30L, 30L, 30L),
  Value = c(0, 0.07, 0, 0.09, 0.02, 0.06, -0.02, 0.154, 0.05, 0.02, -0.03, -0.024, -0.01,
    -0.06, -0.15, -0.04, 0.065, -0.1, -0.09, -0.02, -0.024, -0.11, -0.081, -0.13, -0.12,
    -0.07, -0.16, -0.122, -0.057, -0.18, -0.095, -0.105, -0.23, -0.19, -0.178, -0.267,
    -0.26, -0.158, -0.079, -0.218, -0.148, -0.193),
  Value_error = c(0.17, 0.143, 0.18, 0.18, 0.17, 0.19, 0.18, 0.163, 0.19, 0.18, 0.18,
    0.142, 0.17, 0.18, 0.17, 0.17, 0.152, 0.17, 0.17, 0.17, 0.151, 0.17, 0.154, 0.17,
    0.18, 0.26, 0.17, 0.144, 0.145, 0.18, 0.153, 0.153, 0.17, 0.18, 0.144, 0.155, 0.138,
    0.141, 0.157, 0.14, 0.147, 0.137)),
  .Names = c("Age", "Age_error", "Value", "Value_error"),
  class = "data.frame", row.names = c(NA, -42L))

This is the kind of code I use to get a typical x-y error plot from this data frame.

library(ggplot2)
# Vertical bars take only ymin/ymax; horizontal bars take only xmin/xmax
ggplot(mydata, aes(x = Age, y = Value)) +
  geom_linerange(aes(ymin = Value - Value_error, ymax = Value + Value_error)) +
  geom_errorbarh(aes(xmin = Age - Age_error, xmax = Age + Age_error))

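I have not found a function for an x-y normal-distribution type of plot, and there may not be one, but I thought someone might have some ideas. Many thanks in advance.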

2 answers:

Answer 0 (score: 0)

Are you after a contour plot of Age against Value, drawn as a 2-D kernel density?

library(MASS)
dat <- mydata  # the data frame from the question
dens <- with(dat, MASS::kde2d(Age, Value))
str(dens)
#-------------
# List of 3
#  $ x: num [1:25] 1657 1671 1686 1700 1715 ...
#  $ y: num [1:25] -0.267 -0.249 -0.232 -0.214 -0.197 ...
#  $ z: num [1:25, 1:25] 0.00152 0.00187 0.00226 0.00267 0.00312 ...
#--------------
# kde2d is designed for contour display: x-vector, y-vector, z-matrix
contour(dens)

Data points added so that the connection between the contour plot and the data is more apparent:

 points(dat$Age, dat$Value, cex=0.3, col="red")
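Note that kde2d smooths every point with the same bandwidth, so the per-point Age_error and Value_error columns are not used. A minimal sketch of one alternative follows, assuming the reported errors can be read as 1-sigma standard deviations and that the x and y errors are independent (neither is stated in the question). The helper name gaussian_surface is hypothetical; it sums one bivariate normal per point on a grid and returns the same list(x, y, z) shape that contour() accepts.

```r
# Hypothetical helper: sum one bivariate normal per data point on a regular grid.
# Assumptions (not from the post): errors are 1-sigma standard deviations,
# and the x and y errors are independent.
gaussian_surface <- function(x, y, sx, sy, n = 200, pad = 3) {
  gx <- seq(min(x - pad * sx), max(x + pad * sx), length.out = n)
  gy <- seq(min(y - pad * sy), max(y + pad * sy), length.out = n)
  z <- matrix(0, n, n)
  for (i in seq_along(x)) {
    # Outer product of the two marginal densities = independent 2-D Gaussian
    z <- z + outer(dnorm(gx, x[i], sx[i]), dnorm(gy, y[i], sy[i]))
  }
  list(x = gx, y = gy, z = z)
}

# First three rows of the question's data, for illustration only
age <- c(2003, 1999, 1995); age_err <- c(1, 2, 1)
val <- c(0, 0.07, 0);       val_err <- c(0.17, 0.143, 0.18)

surf <- gaussian_surface(age, val, age_err, val_err)
image(surf, xlab = "Age", ylab = "Value")  # shaded density surface
contour(surf, add = TRUE)                  # contour lines on top
points(age, val, pch = 3, cex = 0.5)       # original points as small crosses
```

With the full data frame the call would be gaussian_surface(dat$Age, dat$Value, dat$Age_error, dat$Value_error). Points with large errors then appear as wide, faint blobs and points with small errors as tight, dark ones.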


Answer 1 (score: 0)

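If you need every (Age, Value) pair to show its +ve and -ve error, then I think you may be looking for the smoothScatter function. It plots the density of the points using a color scheme that fades out with increasing distance from each point.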

smoothScatter(mydata$Age, mydata$Value)
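By default smoothScatter chooses its own kernel bandwidth and does not read the error columns. A rough sketch of tying the smoothing to the reported errors is shown below, using the function's bandwidth argument. Taking the median error on each axis as the bandwidth is an assumption made for illustration, not something the answer proposes, and it still applies a single bandwidth to all points.

```r
# First six rows of the question's data, for illustration only
age     <- c(2003, 1999, 1995, 1993, 1993, 1990)
val     <- c(0, 0.07, 0, 0.09, 0.02, 0.06)
age_err <- c(1, 2, 1, 1, 1, 1)
val_err <- c(0.17, 0.143, 0.18, 0.18, 0.17, 0.19)

# Assumption: use the median reported error on each axis as the kernel bandwidth
bw <- c(median(age_err), median(val_err))

# nrpoints = 0 suppresses smoothScatter's own outlier points
smoothScatter(age, val, bandwidth = bw, nrpoints = 0,
              xlab = "Age", ylab = "Value")
points(age, val, pch = 3, cex = 0.5)  # overlay the original points as crosses
```

smoothScatter relies on the KernSmooth package, which ships with standard R installations as a recommended package.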
