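stat_density2d in ggplot2 does not produce contours

Asked: 2016-10-10 22:09:31

Tags: r ggplot2

I am trying to draw a two-dimensional density plot on top of a scatter plot in ggplot2.

I have the following working code: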

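library(ggplot2)

# 2D density contours (filled polygons) with the scatter points drawn on top
plt <- ggplot(data = for_plot, aes(x = X, y = Y)) +
  stat_density2d(aes(fill = ..level.., alpha = ..level..), geom = 'polygon', colour = 'black') +
  scale_fill_continuous(low = "green", high = "red") +
  guides(alpha = "none") +
  # shortest_path_list is another object in my session (not shown here)
  ylim(0.5, max(shortest_path_list$shortest_path)) +
  geom_point()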

When I run the code with this dataset:

> for_plot[sample(nrow(for_plot), 20), ]
    Y   X
 1: 2 110182.549
 2: 3  95202.283
 3: 2  91557.371
 4: 1   6730.598
 5: 1   7396.081
 6: 1  13939.701
 7: 2   9767.561
 8: 3 101597.449
 9: 2  99368.467
10: 3 102024.722
11: 3  90491.076
12: 3  81337.624
13: 1   5956.710
14: 3  95160.149
15: 3  89981.055
16: 1   8823.615
17: 1  10717.879
18: 2  11463.036
19: 2   3864.292
20: 2  10351.874

It works fine and produces the expected contour plot.

Note that my Y is discrete and X is continuous, and the plot still comes out fine.

However, when I use this dataset:

> for_plot[sample(nrow(for_plot), 20), ]
    Y   X
 1: 1   9897.476
 2: 2   2350.191
 3: 1  13911.780
 4: 1  98885.336
 5: 1  94776.873
 6: 1 102804.832
 7: 1  99956.988
 8: 1  13941.653
 9: 1   9246.795
10: 1  13152.775
11: 1 113325.680
12: 1  82263.657
13: 1  91108.347
14: 1   8823.797
15: 1  11057.255
16: 1  99150.825
17: 2   7312.730
18: 2   6476.152
19: 1 113534.588
20: 1  91311.834 

I get the following warning, and the plot contains no contours:

Warning message:
Computation failed in `stat_density2d()`:
bandwidths must be strictly positive


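I know that one common cause of this error is a lack of variation in the X or Y direction. In this case, however, there seems to be variation similar to the first case. So I don't understand what makes the first scenario work while the second one fails. Is there a workaround to get contours in the second case?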

Here are the two scenarios as minimal reproducible examples, as suggested by Mr. Flick:

Scenario 1 (the plot works):

set.seed(100)
> for_plot<-dput(for_plot[sample(nrow(for_plot), 20), ])
structure(list(Y = c(2, 2, 3, 1, 2, 
3, 3, 3, 2, 1, 3, 2, 2, 3, 1, 3, 2, 3, 2, 1), X = c(96649.7975713206, 
104758.02495167, 93351.5907987183, 5535.8146932624, 99480.6016841293, 
113103.505637801, 90445.3465777551, 81903.811792781, 106832.148472597, 
6576.45291001145, 99368.9134426028, 111130.390217174, 9471.82883910966, 
102087.415882298, 5657.05900168211, 107688.549964059, 103669.855375872, 
94121.8586312176, 1573.00051813297, 7394.05750749363)), .Names = c("Y", "X"), class = c("data.table", 
"data.frame"), row.names = c(NA, -20L), .internal.selfref = <pointer: 0x00000000065c0788>)


Scenario 2 (the plot does not produce the desired output):

> for_plot<-dput(for_plot[sample(nrow(for_plot), 20), ])
structure(list(Y = c(1, 
    1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2), 
    X = c(96925.0119740431, 98869.1560687514, 99434.7995468473, 
    9123.65901167288, 111471.920587976, 109448.280478224, 6678.04323546572, 
    98309.4525934759, 91311.834287723, 86616.727265815, 101009.644050382, 
    7396.08053430818, 102517.086739334, 11504.3148787722, 9471.82883910966, 
    15427.4786153589, 96385.4989659007, 2249.38197350042, 91425.5491534976, 
    9303.7114788096)), .Names = c("Y", 
"X"), class = c("data.table", "data.frame"), row.names = c(NA, 
-20L), .internal.selfref = <pointer: 0x00000000065c0788>)

Warning:

 Warning message:
    Computation failed in `stat_density2d()`:
    bandwidths must be strictly positive
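
One possible explanation, offered as a sketch rather than a confirmed diagnosis: when no bandwidth is supplied, stat_density2d() estimates one per axis with MASS::bandwidth.nrd(), which takes the smaller of the standard deviation and the interquartile range divided by 1.34. If more than 75% of the Y values are identical, the IQR is zero even though the variance is not, so the estimated bandwidth is zero. The snippet below checks this on the Y columns of the two scenarios; the names y1, y2, bw1 and bw2 are introduced here for illustration only.

```r
library(MASS)  # ships with R; provides bandwidth.nrd()

# Y values copied from the dput() output of scenario 1 and scenario 2
y1 <- c(2, 2, 3, 1, 2, 3, 3, 3, 2, 1, 3, 2, 2, 3, 1, 3, 2, 3, 2, 1)
y2 <- c(1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2)

# Both samples have non-zero variance ...
print(c(var(y1), var(y2)))

# ... but in scenario 2 the 25% and 75% quantiles coincide, so the IQR is 0
print(c(IQR(y1), IQR(y2)))

# Default bandwidth rule used when no bandwidth is passed explicitly
bw1 <- bandwidth.nrd(y1)
bw2 <- bandwidth.nrd(y2)
print(c(bw1, bw2))
```

Under this reading, scenario 1 works because Y is spread over 1, 2 and 3, while scenario 2 fails because 16 of its 20 sampled Y values equal 1.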


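Update

One way to get the kernel estimate to work is to add some random noise to the Y variable so that its estimated bandwidth is no longer zero.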

#Add variability for kernel density
rand_noise<-runif(nrow(for_plot), -0.1, 0.1)
for_plot$Y_noise<-for_plot$Y+rand_noise

Although the error goes away and the density contours are drawn, they are not as clean and uniform as in scenario 1.
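
An alternative to jittering, sketched here under assumptions, is to pass the bandwidths explicitly through the h argument of stat_density2d() so that the data are left untouched. The helper safe_bandwidth() below is hypothetical (it is not part of MASS or ggplot2): it uses MASS::bandwidth.nrd() and falls back to the same rule based on the standard deviation alone when the result is zero. Whether that fallback gives a sensible amount of smoothing for a discrete Y is a judgment call. The ylim() call from the original code is omitted because shortest_path_list is not part of the example.

```r
library(MASS)
library(ggplot2)

# Scenario 2 data, copied from the dput() output in the question
for_plot <- data.frame(
  Y = c(1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2),
  X = c(96925.0119740431, 98869.1560687514, 99434.7995468473,
        9123.65901167288, 111471.920587976, 109448.280478224,
        6678.04323546572, 98309.4525934759, 91311.834287723,
        86616.727265815, 101009.644050382, 7396.08053430818,
        102517.086739334, 11504.3148787722, 9471.82883910966,
        15427.4786153589, 96385.4989659007, 2249.38197350042,
        91425.5491534976, 9303.7114788096)
)

# Hypothetical helper: default rule first, sd-only rule if it yields zero
safe_bandwidth <- function(x) {
  bw <- unname(bandwidth.nrd(x))
  if (bw > 0) bw else 4 * 1.06 * sd(x) * length(x)^(-1/5)
}

# Bandwidths for the x and y directions, in that order
h_fixed <- c(safe_bandwidth(for_plot$X), safe_bandwidth(for_plot$Y))

plt <- ggplot(for_plot, aes(x = X, y = Y)) +
  stat_density2d(aes(fill = ..level.., alpha = ..level..),
                 geom = "polygon", colour = "black", h = h_fixed) +
  scale_fill_continuous(low = "green", high = "red") +
  guides(alpha = "none") +
  geom_point()
```

Printing plt should then draw contours without the bandwidth warning, although their shape depends on the fallback bandwidth chosen for Y.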

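As I mentioned in the comments, what really puzzles me is why scenario 1 always works by default while scenario 2 never does. I have verified this with different subsets of the data. The data in scenario 1 and scenario 2 are of the same nature.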

0 answers:

No answers yet