Question

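This is inspired by this question, where the top answer uses an unsafe/erroneous way of adding colors to the legend of a scatter plot.

The top answer suggests doing this: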

data<-iris
plot(data$Sepal.Length, data$Sepal.Width, col=data$Species)
legend(7,4.3,unique(data$Species),col=1:length(data$Species),pch=1)

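A comment suggests using levels() instead of unique() to control the text and colors in the call to legend(), but it is unclear why that would help. I need a better explanation before I can trust that code.

How can I write code that guarantees correct coloring?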

Answer 1

A solution I found is:

data <- iris
# Create a translation table that couples each species to a color
colorcode <- data.frame(
  colorsMy = c("red", "green", "blue"),
  species = levels(data$Species),
  stringsAsFactors = FALSE)
# Build the vector of point colors by looking up each species in colorcode
iriscolors <- sapply(data$Species,
                     function(x) colorcode$colorsMy[colorcode$species == x])
# Plot the scatter using the color vector built from colorcode
plot(data$Sepal.Length, data$Sepal.Width, col = iriscolors, pch = 19)
# iriscolors was derived from colorcode, so colorcode can drive the legend too
legend("bottomright", legend = colorcode$species, fill = colorcode$colorsMy)

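This code is a bit bulky, but it is easy to follow and explicitly constructs the correct color labels in the legend. The "trick" is to create the colorcode variable, which serves as a translation table between the factor levels (here the iris species) and the legend colors.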
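The same translation-table idea can also be sketched with a named vector, which makes the lookup vectorized. This is only an illustrative variant, not part of the answer above; the names species_colors and point_colors are made up for the example.

```r
data <- iris
# Hypothetical palette: one color per factor level, named by level
species_colors <- setNames(c("red", "green", "blue"), levels(data$Species))
# Look up each point's color by its species name (no sapply needed)
point_colors <- species_colors[as.character(data$Species)]
plot(data$Sepal.Length, data$Sepal.Width, col = point_colors, pch = 19)
# Legend text and legend colors come from the same named vector
legend("bottomright", legend = names(species_colors), fill = species_colors)
```

Because the legend labels and the fill colors are read from one object, they cannot drift apart, whatever the row order of the data.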

Answer 2

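The problem with the top answer is that you have no guarantee that unique(data$Species) and levels(data$Species) produce the same order. They do in this example (setosa, versicolor, virginica), but they may not in your data. The levels are what is used for coloring, and by default they are sorted alphabetically, whereas unique just lists the values in the order in which they appear in the data frame.

I will illustrate two cases in which the given code fails, and how using levels produces consistent results.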
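A minimal sketch of that difference on a toy factor (the vector f and its values are invented for illustration):

```r
# Toy factor: values first appear in the order b, a, c
f <- factor(c("b", "a", "b", "c"))
first_seen <- as.character(unique(f))  # order of first appearance: "b" "a" "c"
by_level <- levels(f)                  # alphabetical by default: "a" "b" "c"
codes <- as.integer(f)                 # integer codes into levels(f): 2 1 2 3
# When a factor is passed as col=, base graphics colors by these integer
# codes, so color 1 always belongs to levels(f)[1], not to unique(f)[1].
```

Here unique(f)[1] is "b" while the points drawn in color 1 are the "a" points, which is exactly the mismatch that a legend built from unique() can show.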

A) Reordering the rows of the data frame

data = iris[order(iris$Sepal.Width),]

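If we now run the code posted in the top answer, the colors in the plot are the same (black in the top left, green in the top right), but the assignment in the legend is swapped (black is labeled versicolor instead of setosa, ...).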

unique(data$Species)
## [1] versicolor virginica  setosa      # uses these in legend
## Levels: setosa versicolor virginica   # should be using these!

B) Removing the instances of one factor level without dropping the level

data = iris[iris$Species!="versicolor",]
unique(data$Species)
## [1] setosa    virginica
## Levels: setosa versicolor virginica

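If we now run the code posted above, the black (setosa) points are labeled correctly (because setosa happens to come first both alphabetically and in the data frame), but the legend shows red for virginica while the virginica points are green.

Using levels avoids both problems: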

plot(data$Sepal.Length, data$Sepal.Width, col=data$Species)
legend("topright", legend=levels(data$Species), col=1:nlevels(data$Species), pch=1)
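As a rough check of why this works, the following sketch uses case B and compares the integer codes that col=data$Species amounts to with the legend built from levels. The variable names are illustrative only.

```r
# Case B: versicolor rows removed, but the level itself is kept
data <- iris[iris$Species != "versicolor", ]
codes <- as.integer(data$Species)              # colors used for the points
legend_labels <- levels(data$Species)          # legend text
legend_cols <- seq_len(nlevels(data$Species))  # legend colors 1, 2, 3
# Each point's color code must map back to its own species name
ok <- all(legend_labels[match(codes, legend_cols)] == as.character(data$Species))
used_codes <- sort(unique(codes))              # 1 and 3: code 2 is unused
```

The legend still lists versicolor, for which no points exist. Calling droplevels() on the data first would remove that entry and renumber virginica to code 2 in both the plot and the legend, so the two stay consistent either way.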

R: Legend colors according to factor levels
