How to color a geom_path by a categorical variable

Date: 2018-07-25 15:57:25

Tags: r ggplot2

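I have what is probably a simple question, but I can't figure it out. I'm making a plot with ggplot2 (specifically geom_path()). The path is colored by a categorical variable, intersects, which is TRUE if the path crosses a certain polygon and FALSE otherwise (I set group = 1 so that the path is not grouped by that variable).

It almost works the way I want, except that the color is applied to the path segment that follows a point rather than the segment that precedes it. For example, if observation i = TRUE and i + 1 = FALSE, the resulting path is colored TRUE between positions i and i + 1, and FALSE between positions i + 1 and i + 2. I would like the path between positions i - 1 and i to be colored TRUE, and the path between positions i and i + 1 to be colored FALSE.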

# Load ggplot2 for the plotting calls below.
library(ggplot2)

# Create polygon.
boundary_x <- c(640343.419, 640341.452, 640339.242, 640337.471, 640339.538, 640341.603)
boundary_y <- c(4858742.348, 4858733.404, 4858722.512, 4858722.853, 4858732.737, 4858742.649)
boundary <- data.frame(x = boundary_x, y = boundary_y)

# Sample data
x <- c(640338.007929366, 640338.077929366, 640338.857929366, 640338.867929366, 640338.459933366, 640338.407929366, 640338.174617366, 640338.139168366, 640338.070599366, 640337.747929366, 640337.847929366, 640338.439430366, 640338.777929366, 640338.877929366, 640339.444178366, 640339.557929366, 640340.247929366, 640340.927929366, 640340.977929366, 640341.107929366, 640341.157929366, 640341.427929366, 640341.477929366, 640341.807929366, 640341.847929366, 640342.427929366, 640342.642404366, 640342.867436366, 640342.878517366, 640343.116330366, 640343.097929366, 640343.007929366, 640342.387929366, 640341.929667366, 640341.837929366, 640339.927929366, 640339.847929366, 640336.427929366, 640335.717929366, 640335.057929366, 640334.967929366, 640334.681813366, 640334.208384366, 640334.172648366, 640334.417929366, 640334.587929366, 640334.777929366, 640334.987929366, 640334.925775366, 640338.257929366, 640338.187929366, 640338.057929366, 640338.077929366, 640338.077929366, 640340.200274366, 640341.037929366, 640341.114123366, 640341.187929366, 640341.237929366)
y <- c(4858731.28088173, 4858731.24088173, 4858730.80088173, 4858730.79088173, 4858728.57674273, 4858728.30088173, 4858727.05816773, 4858726.86768973, 4858726.36255673, 4858722.41088173, 4858722.03088173, 4858721.55321173, 4858721.29088173, 4858721.27088173, 4858721.16125073, 4858721.13088173, 4858721.06088173, 4858720.89088173, 4858720.90088173, 4858720.86088173, 4858720.85088173, 4858720.83088173, 4858720.84088173, 4858721.10088173, 4858721.14088173, 4858722.17088173, 4858722.50853873, 4858722.94242373, 4858722.98987973, 4858725.39572673, 4858725.44088173, 4858725.57088173, 4858725.69088173, 4858725.44266973, 4858725.40088173, 4858721.90088173, 4858721.81088173, 4858721.76088173, 4858721.93088173, 4858722.11088173, 4858722.18088173, 4858722.67533273, 4858723.32189973, 4858723.40558473, 4858727.23088173, 4858727.71088173, 4858728.14088173, 4858728.61088173, 4858730.42873273, 4858728.23088173, 4858727.62088173, 4858726.41088173, 4858726.32088173, 4858726.32088173, 4858726.02508273, 4858726.13088173, 4858726.13140073, 4858726.19088173, 4858726.23088173)
intersects <- c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
df <- data.frame(x = x, y = y, intersects = intersects)

# Plot
ggplot() + 
  geom_polygon(data = boundary, aes(x, y)) + 
  geom_path(data = df, aes(x, y, col = intersects, group = 1)) + 
  geom_point(data = df, aes(x, y, col = intersects)) + 
  coord_cartesian(xlim = c(640334, 640343), ylim = c(4858721, 4858731)) 

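Looking at the plot, you will see that the blue segments represent intersects = TRUE, and that they appear immediately after the path has crossed the polygon. I would like to shift the color back, so to speak, so that the segment that actually crosses the polygon is the one that is colored.

I'm new here and don't have enough reputation to post images. Sorry!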

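1 Answer:

Answer 0: (score: 0)

The issue here is that the colors are assigned (via the intersects column) to the points, not to the lines that are actually drawn in color. ggplot2 treats each point as the start of a line segment and colors that segment accordingly, whereas you are treating each point as the end of a segment.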
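To make the mismatch concrete, here is a small base R sketch on toy data. The names flag and segments are made up for illustration and are not part of the question. It lists each segment of a four-point path next to the value of its start point (the value geom_path() uses, according to the explanation above) and the value of its end point (the value the question wants).

```r
# Toy data: four points along a path; only point 3 is flagged.
flag <- c(FALSE, FALSE, TRUE, FALSE)
n <- length(flag)

# One row per segment, running from point i to point i + 1.
segments <- data.frame(
  from         = seq_len(n - 1),
  to           = seq_len(n - 1) + 1,
  colour_start = flag[-n],  # value at the start point: used by geom_path()
  colour_end   = flag[-1]   # value at the end point: what the question asks for
)
segments
```

With start-point coloring the flagged segment is 3 to 4, whereas the question wants segment 2 to 3 flagged.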

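You could, of course, change the definition of the intersects column so that it matches the way ggplot2 handles things. Alternatively, you can modify the column at plotting time using lead() from the dplyr package: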

ggplot() + 
  geom_polygon(data = boundary, aes(x, y)) + 
  geom_path(data = df, aes(x, y, col = dplyr::lead(intersects, default = FALSE), group = 1)) + 
  geom_point(data = df, aes(x, y, col = intersects)) + 
  coord_cartesian(xlim = c(640334, 640343), ylim = c(4858721, 4858731)) +
  labs(col = "intersects")


The function lead() simply takes a vector and shifts its contents one element to the left:

dplyr::lead(1:3)
## [1]  2  3 NA

Since no value exists to fill the last element of the vector, lead() puts NA there by default, but you can supply the value you want:

dplyr::lead(1:3, default = 7)
## [1] 2 3 7

There is also a similar function, dplyr::lag(), which shifts in the other direction.
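If you would rather not depend on dplyr, the same shifts can be sketched in base R. This is only an illustration under that assumption: v, lead_base and lag_base are hypothetical names, and the padding value FALSE plays the role of the default argument.

```r
# Toy logical vector standing in for the intersects column.
v <- c(FALSE, TRUE, FALSE, FALSE)

# Shift left by one, padding the end with FALSE
# (same result as dplyr::lead(v, default = FALSE)).
lead_base <- c(v[-1], FALSE)

# Shift right by one, padding the start with FALSE
# (same result as dplyr::lag(v, default = FALSE)).
lag_base <- c(FALSE, v[-length(v)])

lead_base
## [1]  TRUE FALSE FALSE FALSE
lag_base
## [1] FALSE FALSE  TRUE FALSE
```

The left-shifted vector could be stored as an extra column of df and mapped to col in geom_path(), which corresponds to the first option of changing how the column is defined.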