I am using ggplot to draw a box-and-whisker plot, with each individual data point also plotted on top.
Here is my current code:
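library(readxl)
library(ggplot2)

# Read the spreadsheet; the fourth column (Length) is numeric, the rest are text
df_3UTR_ext_colors <- read_excel(pathname_ext, sheet = "Sheet1",
                                 col_types = c("text", "text", "text",
                                               "numeric", "text", "text"))

# Plotting (df_3UTR_ext_colors_simple is the data frame shown in the dput() output below)
allGenes_colors_simple <- ggplot(df_3UTR_ext_colors_simple, aes(Gene, Length))
allGenes_colors_simple +
  # Boxes filled by genus; outliers hidden because every point is drawn separately
  geom_boxplot(outlier.shape = NA, aes(fill = Genus), alpha = 0.5) +
  scale_fill_brewer(palette = "Set1") +
  # Points dodged with the same width as the boxes so they sit on top of them
  geom_point(aes(fill = Genus),
             size = 2, shape = 21, position = position_dodge(width = 0.75))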
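This is the current output of that code:
I would like to color each point individually, using the hex codes I have added to the data frame.
Ideally, each point's color would be a variant of the color of its associated box (for example, all viruses in the genus Henipavirus should be different shades of red). I have already worked this out by hand and stored the resulting hex codes in the Colors column, in case that turns out to be the easiest way to do it.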
I have tried many iterations, none of them successful. For example, in geom_point() I am currently using aes(fill = Genus). If I replace that with aes(fill = Virus), the result looks like this:
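Clearly there are several problems here. First, the palette runs out of colors. That is easy to fix, so I am not too worried about it. Second, the data points are suddenly no longer aligned with their associated boxes and start to drift sideways. Third, this approach still does not give me manual control over the color of each individual point.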
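A possible explanation for the drift, offered as a sketch rather than a confirmed diagnosis: when no group aesthetic is given, ggplot2 groups by the combination of all discrete variables mapped in the layer, so mapping Virus in geom_point() makes position_dodge() reserve one slot per virus instead of one per genus. Setting group = Genus explicitly should keep the points over their boxes. Because the fill scale is already used by the boxes, the sketch below moves the hex codes to the colour aesthetic with scale_colour_identity() and switches to the solid shape 16 (an assumption on my part; it drops the black outline of shape 21). The data frame df is a small subset of the posted data, used only to keep the example self-contained.

```r
library(ggplot2)

# Small subset of the posted data (HeV, NiV, MeV, CDV; genes N and P)
df <- data.frame(
  Genus  = rep(c("Henipavirus", "Morbillivirus"), each = 4),
  Virus  = rep(c("HeV", "NiV", "MeV", "CDV"), each = 2),
  Gene   = rep(c("N", "P"), times = 4),
  Length = c(568, 469, 586, 469, 59, 72, 59, 72),
  Colors = rep(c("#A50F15", "#DE2D26", "#3182BD", "#9ECAE1"), each = 2)
)

p <- ggplot(df, aes(Gene, Length)) +
  # Boxes keep the genus-level fill scale
  geom_boxplot(aes(fill = Genus), outlier.shape = NA, alpha = 0.5) +
  scale_fill_brewer(palette = "Set1") +
  # Points: hex codes used literally via the colour aesthetic;
  # group = Genus makes the dodge use one slot per genus, like the boxes
  geom_point(aes(colour = Colors, group = Genus),
             shape = 16, size = 2,
             position = position_dodge(width = 0.75)) +
  # Use the values in Colors as-is instead of mapping them through a palette
  scale_colour_identity()

p
```

Very pale shades may be hard to see without an outline; one option would be to add a second geom_point() layer with shape = 1 and a fixed colour = "black", using the same group and position settings.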
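My feeling is that RColorBrewer offers simpler ways to assign a palette to each genus and thereby give each virus its own color (in fact, browsing the palettes by hand is how I obtained the hex codes that I added to the data frame manually). However, if I can get ggplot to color each point individually according to the colors I added by hand, I am not too concerned about that.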
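As a rough sketch of the RColorBrewer route, the shades could be generated per genus instead of typed in by hand. The pairing of each genus with a sequential palette (genus_palette) and the helper shades_for() are hypothetical names introduced here for illustration; only two genera from the posted data are shown. Note that brewer.pal() needs at least three colors, so the helper requests at least three and keeps the darkest n.

```r
library(RColorBrewer)

# Viruses per genus, taken from the posted data (two genera shown)
viruses <- data.frame(
  Genus = c(rep("Henipavirus", 5), rep("Morbillivirus", 3)),
  Virus = c("HeV", "NiV", "CedV", "GhV", "MojV", "MeV", "CDV", "FeMV")
)

# Assumed genus-to-palette pairing (hypothetical; any sequential palettes work)
genus_palette <- c(Henipavirus = "Reds", Morbillivirus = "Blues")

# Return n shades for a genus, darkest first
shades_for <- function(genus, n) {
  rev(brewer.pal(max(n, 3), genus_palette[[genus]]))[seq_len(n)]
}

# Fill in one hex code per virus, genus by genus
viruses$Colors <- NA_character_
for (g in unique(viruses$Genus)) {
  idx <- viruses$Genus == g
  viruses$Colors[idx] <- shades_for(g, sum(idx))
}

viruses
```

The resulting lookup table could then be joined to the main data frame by Genus and Virus, for example with merge(), in place of a hand-typed Colors column.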
Does anyone have any suggestions?
> dput(df_3UTR_ext_colors_simple)
structure(list(Genus = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("Henipavirus",
"Morbillivirus", "Rubulavirus", "Respirovirus", "Avulavirus",
"Aquaparamyxovirus", "Ferlavirus"), class = "factor"), Virus = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 9L,
9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L,
11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L,
13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 15L, 15L, 15L, 15L, 15L,
15L, 16L, 16L, 16L, 16L, 16L, 16L), .Label = c("HeV", "NiV",
"CedV", "GhV", "MojV", "MeV", "CDV", "FeMV", "MuV", "HPIV-2",
"PIV5", "SeV", "HPIV1", "HPIV3", "APMV-1_NDV", "APMV-3", "AsaPV",
"FDLV"), class = "factor"), Gene = structure(c(1L, 2L, 3L, 4L,
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L,
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("N",
"P", "M", "F", "RBP", "RdRp"), class = "factor"), Length = c(568,
469, 200, 418, 516, 67, 586, 469, 200, 412, 504, 67, 334, 192,
408, 88, 139, 63, 238, 213, 314, 82, 68, 65, 164, 144, 455, 173,
543, 50, 59, 72, 426, 137, 84, 176, 59, 72, 407, 132, 111, 65,
47, 98, 333, 344, 116, 150, 111, 74, 90, 48, 66, 137, 134, 187,
130, 186, 210, 44, 106, 66, 204, 100, 111, 34, 43, 83, 94, 70,
104, 85, 43, 113, 94, 88, 110, 100, 43, 122, 61, 38, 90, 71,
217, 180, 112, 84, 195, 77, 167, 203, 67, 204, 247, 221), Use = c("Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "No", "No", "No", "No", "No", "No", "No", "No",
"No", "No", "No", "No"), Colors = c("#A50F15", "#A50F15", "#A50F15",
"#A50F15", "#A50F15", "#A50F15", "#DE2D26", "#DE2D26", "#DE2D26",
"#DE2D26", "#DE2D26", "#DE2D26", "#FB6A4A", "#FB6A4A", "#FB6A4A",
"#FB6A4A", "#FB6A4A", "#FB6A4A", "#FCAE91", "#FCAE91", "#FCAE91",
"#FCAE91", "#FCAE91", "#FCAE91", "#FEE5D9", "#FEE5D9", "#FEE5D9",
"#FEE5D9", "#FEE5D9", "#FEE5D9", "#3182BD", "#3182BD", "#3182BD",
"#3182BD", "#3182BD", "#3182BD", "#9ECAE1", "#9ECAE1", "#9ECAE1",
"#9ECAE1", "#9ECAE1", "#9ECAE1", "#DEEBF7", "#DEEBF7", "#DEEBF7",
"#DEEBF7", "#DEEBF7", "#DEEBF7", "#31A354", "#31A354", "#31A354",
"#31A354", "#31A354", "#31A354", "#A1D99B", "#A1D99B", "#A1D99B",
"#A1D99B", "#A1D99B", "#A1D99B", "#E5F5E0", "#E5F5E0", "#E5F5E0",
"#E5F5E0", "#E5F5E0", "#E5F5E0", "#756BB1", "#756BB1", "#756BB1",
"#756BB1", "#756BB1", "#756BB1", "#BCBDDC", "#BCBDDC", "#BCBDDC",
"#BCBDDC", "#BCBDDC", "#BCBDDC", "#EFEDF5", "#EFEDF5", "#EFEDF5",
"#EFEDF5", "#EFEDF5", "#EFEDF5", "#E6550D", "#E6550D", "#E6550D",
"#E6550D", "#E6550D", "#E6550D", "#FEE6CE", "#FEE6CE", "#FEE6CE",
"#FEE6CE", "#FEE6CE", "#FEE6CE")), row.names = c(NA, -96L), class = c("tbl_df",
"tbl", "data.frame"))