Question

Here is a sample dataset that illustrates the problem:

s <- 
"F V1  V2  P
0 0.5 0.7  0
0 1.5 1.7  1
1 0.7 0.9  0
1 1.7 1.9  1
"
d <- read.delim(textConnection(s), sep="")

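I would like to plot this data in a single figure with ggplot, so that:

the x axis shows P
the y axis shows both V1 (triangles) and V2 (squares)
points with F = 0 are red and points with F = 1 are blue.

In other words, I want to plot two columns of a data frame with different markers, with the colour of each point determined by F.

Thanks.

Edit: I believe this is not a duplicate question. In the referenced answer the data frame is melted, but when I melt my data I also lose the F column, which defines the colour, so that solution does not work.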

Answer 1

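There are two options here:

Since there are only two value columns, they can be plotted with separate calls to geom_point(). This is not recommended in general and does not produce a proper legend, but it gives a quick answer.
The recommended way for ggplot2 is to reshape the value columns from wide to long format, using F and P as id variables so that the colour indicator F is not lost.

1. Plot the data in wide format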

library(ggplot2)
g <- ggplot(d, aes(factor(P), color = factor(F))) + 
  geom_point(aes(y = V1), shape = "triangle") +
  geom_point(aes(y = V2), shape = "square")
g

With polishing:

g +
  ylab("V1, V2") +
  xlab("P") +
  scale_colour_manual(name = "F", values = c("red", "blue"))

Note that both F and P are explicitly converted to discrete variables with factor().

2. Plot the data in long format

library(reshape2)
# reshape data from wide to long format
long <- melt(d, c("F", "P"))
g <- ggplot(long, aes(factor(P), value, shape = variable, color = factor(F))) + 
  geom_point()
g

With polishing:

g +
  xlab("P") +
  scale_colour_manual(name = "F", values = c("red", "blue")) +
  scale_shape_manual(values = c("triangle", "square"))

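When reshaping from wide to long format, it is important to specify which variables are id variables, which are repeated in every row, and which are measure variables, which make up the value column of the long format.

So

melt(d, c("F", "P"))

and

melt(d, measure.vars = c("V1", "V2"))

produce the same result: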

  F P variable value
1 0 0       V1   0.5
2 0 1       V1   1.5
3 1 0       V1   0.7
4 1 1       V1   1.7
5 0 0       V2   0.7
6 0 1       V2   1.7
7 1 0       V2   0.9
8 1 1       V2   1.9
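
The equivalence of the two calls can be checked directly. The following is a minimal sketch that rebuilds the sample data from the question; the names by_id and by_measure are only illustrative and do not come from the original answer.

```r
library(reshape2)

# Rebuild the sample data from the question
d <- read.delim(textConnection("F V1 V2 P
0 0.5 0.7 0
0 1.5 1.7 1
1 0.7 0.9 0
1 1.7 1.9 1
"), sep = "")

# Variant 1: name the id variables; all remaining columns are measured
by_id <- melt(d, id.vars = c("F", "P"))

# Variant 2: name the measure variables; all remaining columns are ids
by_measure <- melt(d, measure.vars = c("V1", "V2"))

# Compare the two long-format data frames
all.equal(by_id, by_measure)
```

Either way, F stays in the long data as an id column, so it remains available for the colour mapping.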

(For the sake of completeness, the data.table version of melt() understands pattern matching on column names.)
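
As a hedged illustration of that remark, the sketch below uses patterns() from data.table to select the measure columns by a regular expression. The regular expression "^V" and the names dt and long_dt are assumptions made for this example, not part of the original answer.

```r
library(data.table)

# Rebuild the sample data from the question
d <- read.delim(textConnection("F V1 V2 P
0 0.5 0.7 0
0 1.5 1.7 1
1 0.7 0.9 0
1 1.7 1.9 1
"), sep = "")

dt <- as.data.table(d)

# Select every column whose name starts with "V" as a measure variable;
# the remaining columns (F and P) are kept as id variables
long_dt <- melt(dt, measure.vars = patterns("^V"))
long_dt
```

This is mainly convenient when there are many value columns that share a naming scheme, so they do not have to be listed one by one.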

Answer 2

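tidyr::gather can be a good alternative to reshape2::melt. You just specify the variables to gather, as you would in dplyr's select, and give their new name in the key argument. The value argument names the column that holds the corresponding values.

Here, so as not to lose F: gather(-P, -F, key = "V", value = "value")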

    s <- 
    "F V1  V2  P
    0 0.5 0.7  0
    0 1.5 1.7  1
    1 0.7 0.9  0
    1 1.7 1.9  1
    "
    d <- read.delim(textConnection(s), sep="")
    library(tidyverse)
    library(ggplot2)
    d %>%
      rename(f = F) %>% # just not to confuse with FALSE
      gather(-P, -f, key = "V", value = "value") %>% # tidyr::gather
      ggplot(aes(x = P, y = value, shape = V, color = factor(f))) +
      geom_point() +
      geom_line() +
      scale_color_manual(name = "F", values = c("0" = "red", "1" = "blue")) +
      scale_shape_manual(name = "V", values = c("V1" = 2, "V2" = 0))
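
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following sketch is not part of the original answer. It shows the same reshaping and plot with that function, assuming a tidyr version that provides pivot_longer(); the names long and p are only illustrative.

```r
library(tidyr)
library(ggplot2)

# Rebuild the sample data from the question
d <- read.delim(textConnection("F V1 V2 P
0 0.5 0.7 0
0 1.5 1.7 1
1 0.7 0.9 0
1 1.7 1.9 1
"), sep = "")

# Stack V1 and V2 into one value column; F and P are kept as they are
long <- pivot_longer(d, cols = c(V1, V2), names_to = "V", values_to = "value")

# Same mapping as above: shape by source column, colour by F
p <- ggplot(long, aes(x = P, y = value, shape = V, color = factor(F))) +
  geom_point() +
  geom_line() +
  scale_color_manual(name = "F", values = c("0" = "red", "1" = "blue")) +
  scale_shape_manual(name = "V", values = c("V1" = 2, "V2" = 0))
p
```

Here the columns to stack are named explicitly in cols, rather than excluding P and F as in the gather() call.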

ggplot: plotting two columns of a data frame
