Coloring each individual data point on a ggplot box-and-whisker plot with pre-selected colors

Asked: 2019-01-31 22:42:12

Tags: r ggplot2

I am using ggplot to draw a box-and-whisker plot, with each individual data point also plotted on top.

Here is my current code:

library(readxl)   # read_excel()
library(ggplot2)  # ggplot(), geom_boxplot(), geom_point()

df_3UTR_ext_colors <- read_excel(pathname_ext, sheet = "Sheet1",
                                 col_types = c("text", "text", "text",
                                               "numeric", "text", "text"))

# Plotting
# Note: the plot uses df_3UTR_ext_colors_simple (see the dput() output below),
# not the df_3UTR_ext_colors object read in above.
allGenes_colors_simple <- ggplot(df_3UTR_ext_colors_simple, aes(Gene, Length))

allGenes_colors_simple +
  geom_boxplot(aes(fill = Genus), outlier.shape = NA, alpha = 0.5) +
  scale_fill_brewer(palette = "Set1") +
  geom_point(aes(fill = Genus),
             size = 2, shape = 21, position = position_dodge(width = 0.75))

This is the current output of that code:

Output for above code

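I would like to color each point individually according to the hex codes I have added to the data frame.

Ideally, each point's color would be a variation of the color of its associated box (for example, all viruses that make up the genus Henipavirus should have different shades of red). I have already worked this out by hand and stored the resulting hex codes in the "Colors" column, in case that is the easiest way to do it.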
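One way to use the hex codes in the Colors column directly is an identity scale. The sketch below is not from the original post. It builds a small toy data frame (`df`, a hypothetical stand-in that copies a few rows from the dput() output further down) and maps Colors to the colour aesthetic with scale_colour_identity(), so the points do not compete with the fill scale used by the boxes. It assumes solid points (shape 16) are an acceptable replacement for the outlined shape 21.

```r
library(ggplot2)

# Toy stand-in for df_3UTR_ext_colors_simple (hypothetical name `df`);
# the values are copied from a few rows of the dput() output.
df <- data.frame(
  Genus  = rep(c("Henipavirus", "Morbillivirus"), each = 4),
  Virus  = rep(c("HeV", "NiV", "MeV", "CDV"), each = 2),
  Gene   = rep(c("N", "P"), times = 4),
  Length = c(568, 469, 586, 469, 59, 72, 59, 72),
  Colors = rep(c("#A50F15", "#DE2D26", "#3182BD", "#9ECAE1"), each = 2),
  stringsAsFactors = FALSE
)

p <- ggplot(df, aes(Gene, Length)) +
  # Boxes keep the per-genus Brewer fill from the original code
  geom_boxplot(aes(fill = Genus), outlier.shape = NA, alpha = 0.5) +
  scale_fill_brewer(palette = "Set1") +
  # Points take their colour verbatim from the Colors column;
  # group = Genus keeps the dodging per genus rather than per colour
  geom_point(aes(colour = Colors, group = Genus),
             size = 2, shape = 16,
             position = position_dodge(width = 0.75)) +
  scale_colour_identity()

p
```

Because an identity scale draws no legend by default, the per-virus shades would need to be documented some other way, for example in the figure caption.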

I have tried many iterations without success. For example, in geom_point() I am currently using aes(fill = Genus). If I replace that with aes(fill = Virus), the result looks like this:

Switching to aes(fill = Virus) within geom_point().

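Clearly there are several problems here. One is that the palette runs out of colors. That is easy to fix, so I am not too worried about it. Another is that the data points are suddenly no longer aligned with their associated boxes; they drift away from them. On top of that, this approach limits my manual control over the color of each individual point.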
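A likely cause of the drift is that position_dodge() spreads points by group, and with aes(fill = Virus) the implicit group becomes the virus rather than the genus. The sketch below is an illustration, not part of the original post. It keeps shape 21 with fill = Virus, adds group = Genus so the points dodge like the boxes, and replaces the Brewer scale with one scale_fill_manual() whose named values cover both genera and viruses, which also avoids running out of palette colors. The toy data frame `df` and the two genus colors are assumptions made for this example.

```r
library(ggplot2)

# Toy stand-in for df_3UTR_ext_colors_simple (hypothetical name `df`);
# the values are copied from a few rows of the dput() output.
df <- data.frame(
  Genus  = rep(c("Henipavirus", "Morbillivirus"), each = 4),
  Virus  = rep(c("HeV", "NiV", "MeV", "CDV"), each = 2),
  Gene   = rep(c("N", "P"), times = 4),
  Length = c(568, 469, 586, 469, 59, 72, 59, 72),
  Colors = rep(c("#A50F15", "#DE2D26", "#3182BD", "#9ECAE1"), each = 2),
  stringsAsFactors = FALSE
)

# Named vector: one hand-picked colour per virus, read from the Colors column
virus_cols <- unique(df[, c("Virus", "Colors")])
virus_cols <- setNames(virus_cols$Colors, virus_cols$Virus)

# Named vector: one colour per genus for the boxes (assumed choices)
genus_cols <- c(Henipavirus = "#E41A1C", Morbillivirus = "#377EB8")

p <- ggplot(df, aes(Gene, Length)) +
  geom_boxplot(aes(fill = Genus), outlier.shape = NA, alpha = 0.5) +
  # fill = Virus picks the shade; group = Genus controls the dodging
  geom_point(aes(fill = Virus, group = Genus),
             size = 2, shape = 21,
             position = position_dodge(width = 0.75)) +
  # A single fill scale that knows every genus and every virus by name
  scale_fill_manual(values = c(genus_cols, virus_cols))

p
```

The legend then lists genera and viruses together; the `breaks` argument of scale_fill_manual() can restrict it to one of the two sets if that is preferred.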

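My feeling is that there are much simpler ways to use RColorBrewer to assign a palette to each genus and so give each virus its own color (in fact, browsing the palettes by hand is how I got the hex codes I added to the data frame). But if I can just get ggplot to color each point individually based on the colors I added by hand, I would not be too worried about that.

Does anyone have any suggestions?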
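As a rough illustration of the RColorBrewer idea, the shades could be generated rather than typed in by hand. The sketch below is an assumption-laden example, not the original author's method: the genus-to-palette assignment (`palette_by_genus`) and the helper `shades_for()` are hypothetical names, and only two of the genera from the data are shown.

```r
library(RColorBrewer)

# Genus -> virus membership, taken from the data in the question (subset)
viruses_by_genus <- list(
  Henipavirus   = c("HeV", "NiV", "CedV", "GhV", "MojV"),
  Morbillivirus = c("MeV", "CDV", "FeMV")
)

# Assumed assignment of one sequential Brewer palette per genus
palette_by_genus <- c(Henipavirus = "Reds", Morbillivirus = "Blues")

# Hypothetical helper: one shade per virus in a genus, darkest first
shades_for <- function(genus) {
  viruses <- viruses_by_genus[[genus]]
  n <- max(3, length(viruses))  # brewer.pal() needs at least 3 colours
  shades <- rev(brewer.pal(n, palette_by_genus[[genus]]))
  setNames(shades[seq_along(viruses)], viruses)
}

# Named vector of hex codes, one entry per virus
virus_cols <- unlist(lapply(names(viruses_by_genus), shades_for))
virus_cols
```

A named vector like this can be passed as `values` to scale_fill_manual(), or joined back onto the data frame as a Colors column.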

> dput(df_3UTR_ext_colors_simple)
structure(list(Genus = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Henipavirus", 
"Morbillivirus", "Rubulavirus", "Respirovirus", "Avulavirus", 
"Aquaparamyxovirus", "Ferlavirus"), class = "factor"), Virus = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 
6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 
9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 
11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 
13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 15L, 
15L, 16L, 16L, 16L, 16L, 16L, 16L), .Label = c("HeV", "NiV", 
"CedV", "GhV", "MojV", "MeV", "CDV", "FeMV", "MuV", "HPIV-2", 
"PIV5", "SeV", "HPIV1", "HPIV3", "APMV-1_NDV", "APMV-3", "AsaPV", 
"FDLV"), class = "factor"), Gene = structure(c(1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("N", 
"P", "M", "F", "RBP", "RdRp"), class = "factor"), Length = c(568, 
469, 200, 418, 516, 67, 586, 469, 200, 412, 504, 67, 334, 192, 
408, 88, 139, 63, 238, 213, 314, 82, 68, 65, 164, 144, 455, 173, 
543, 50, 59, 72, 426, 137, 84, 176, 59, 72, 407, 132, 111, 65, 
47, 98, 333, 344, 116, 150, 111, 74, 90, 48, 66, 137, 134, 187, 
130, 186, 210, 44, 106, 66, 204, 100, 111, 34, 43, 83, 94, 70, 
104, 85, 43, 113, 94, 88, 110, 100, 43, 122, 61, 38, 90, 71, 
217, 180, 112, 84, 195, 77, 167, 203, 67, 204, 247, 221), Use = c("Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", 
"Yes", "Yes", "No", "No", "No", "No", "No", "No", "No", "No", 
"No", "No", "No", "No"), Colors = c("#A50F15", "#A50F15", "#A50F15", 
"#A50F15", "#A50F15", "#A50F15", "#DE2D26", "#DE2D26", "#DE2D26", 
"#DE2D26", "#DE2D26", "#DE2D26", "#FB6A4A", "#FB6A4A", "#FB6A4A", 
"#FB6A4A", "#FB6A4A", "#FB6A4A", "#FCAE91", "#FCAE91", "#FCAE91", 
"#FCAE91", "#FCAE91", "#FCAE91", "#FEE5D9", "#FEE5D9", "#FEE5D9", 
"#FEE5D9", "#FEE5D9", "#FEE5D9", "#3182BD", "#3182BD", "#3182BD", 
"#3182BD", "#3182BD", "#3182BD", "#9ECAE1", "#9ECAE1", "#9ECAE1", 
"#9ECAE1", "#9ECAE1", "#9ECAE1", "#DEEBF7", "#DEEBF7", "#DEEBF7", 
"#DEEBF7", "#DEEBF7", "#DEEBF7", "#31A354", "#31A354", "#31A354", 
"#31A354", "#31A354", "#31A354", "#A1D99B", "#A1D99B", "#A1D99B", 
"#A1D99B", "#A1D99B", "#A1D99B", "#E5F5E0", "#E5F5E0", "#E5F5E0", 
"#E5F5E0", "#E5F5E0", "#E5F5E0", "#756BB1", "#756BB1", "#756BB1", 
"#756BB1", "#756BB1", "#756BB1", "#BCBDDC", "#BCBDDC", "#BCBDDC", 
"#BCBDDC", "#BCBDDC", "#BCBDDC", "#EFEDF5", "#EFEDF5", "#EFEDF5", 
"#EFEDF5", "#EFEDF5", "#EFEDF5", "#E6550D", "#E6550D", "#E6550D", 
"#E6550D", "#E6550D", "#E6550D", "#FEE6CE", "#FEE6CE", "#FEE6CE", 
"#FEE6CE", "#FEE6CE", "#FEE6CE")), row.names = c(NA, -96L), class = c("tbl_df", 
"tbl", "data.frame"))

0 answers:

No answers yet.