I'm creating a relative abundance plot in ggplot2 that shows multiple biological taxa. I'd like to add a separate color gradient for each phylum, so that the color of each taxon shows which phylum it belongs to. What I have right now is this:
library(ggplot2)
library(scales)
library(reshape2)
library(plyr)

# Read the abundance table; the first column supplies the row names (taxa)
taxdat <- read.table("fig_2.txt", header = TRUE, row.names = 1)
# Reshape to long format, keeping the taxon name as an id column
data <- melt(cbind(taxdat, taxa = rownames(taxdat)), id.vars = c('taxa'))
# Keep taxa in file order rather than alphabetical order
data$taxa <- factor(data$taxa, levels = unique(data$taxa))

# Stacked bars scaled to 100% per sample, one fill color per taxon
ggplot(data, aes(x = variable, y = value, fill = taxa)) +
  geom_bar(position = "fill", stat = "identity") +
  scale_y_continuous(labels = percent_format()) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_blank())
Which produces this...
However, I'd like to color-code each taxon by the phylum it belongs to, while still giving each taxon its own shade so that the taxa can be told apart. For example, orange hues for "Arthropoda", green hues for "Nematoda", etc. Any help with this would be greatly appreciated.
Thanks, Dean
P.S. Here's the taxdata if you want it:
                 Abund  Phylogeny
Metazoa          13     Metazoa
Arthropoda       3      Arthropoda
Arachnida        3      Arthropoda
Alicorhagia      3      Arthropoda
Araneae          2      Arthropoda
Harpacticoida    1      Arthropoda
Lepidoptera      6      Arthropoda
Oribatida        4      Arthropoda
Gehypochthonius  1      Arthropoda
Coccinellidae    5      Arthropoda
Salticidae       3      Arthropoda
Liochthonius     3      Arthropoda
Paraphidippus    1      Arthropoda
Paucitubulatina  4      Gastrotricha
Chaetonotidae    1      Gastrotricha
Nematoda         30     Nematoda
Chromadorea      5      Nematoda
Dorylaimida      2      Nematoda
Plectidae        1      Nematoda
Prismatolaimus   2      Nematoda
Alaimus          2      Nematoda
Geomonhystera    10     Nematoda
Mesodorylaimus   1      Nematoda
Prodesmodora     1      Nematoda
Tylocephalus     1      Nematoda
Eutardigrada     1      Tardigrada
Parachela        2      Tardigrada
UPDATE: I've since added the phylogeny column to the taxdata, so the code above will not run as-is if simply copied and pasted.
Answer 0 (score: 0)
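I would think about the most effective visualization for this data. One problem with color is that beyond a certain number of categories (around 8), it stops being an effective way to distinguish them. So in your case, colors could work for phyla, but not for taxa. Another consideration: with multiple factors, you usually need more than one way of grouping them (i.e. something other than color).
Define the question you want the visualization to answer, e.g. "I want to see quickly the predominant taxa in each phylum". Here is a ggplot suggestion that addresses that question. I assume your data is in a data frame, data, which includes a column named Taxa rather than having the taxa as row names.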
library(ggplot2)
ggplot(data, aes(Taxa, Abund)) + geom_col() +
facet_grid(Phylogeny ~ .) + theme_light() +
theme(axis.text.x = element_text(angle = 90))
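The plot above assumes a data frame with a Taxa column. As a rough sketch of one way to get there from the table posted in the question, the row names can be moved into a regular column. Only a handful of rows are copied here for brevity, and the object names taxdat and data simply follow the question; with the real file you would call read.table("fig_2.txt", header = TRUE) instead of using text =.

```r
# A few rows copied from the table in the question (the full table has 27 rows).
txt <- "Abund Phylogeny
Metazoa 13 Metazoa
Arthropoda 3 Arthropoda
Paucitubulatina 4 Gastrotricha
Nematoda 30 Nematoda
Eutardigrada 1 Tardigrada"

# The header has one field fewer than the data rows, so read.table()
# uses the first column as row names.
taxdat <- read.table(text = txt, header = TRUE)

# Move the row names into a regular column named Taxa (the name the plot code assumes).
data <- cbind(Taxa = rownames(taxdat), taxdat)
rownames(data) <- NULL

# Keep taxa in file order rather than alphabetical order on the axis.
data$Taxa <- factor(data$Taxa, levels = unique(data$Taxa))
```

After this, data has the columns Taxa, Abund and Phylogeny, which are the names used in the facet_grid() call.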
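You might also consider tools and methods other than ggplot. For example, another way to visualize proportions within hierarchical categories is a treemap: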
library(treemap)
treemap(data, index = c("Phylogeny", "Taxa"), vSize = "Abund",
vColor = "Abund", palette = "Spectral")
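If you still want one hue family per phylum despite the caveat about many colors, a possible sketch is to build a named color vector with base R's colorRampPalette(), interpolating one shade per taxon within each phylum. The endpoint colors in ramps are my own arbitrary picks, not anything from the question, and only two phyla with a few taxa each are shown.

```r
# A subset of taxa and phyla copied from the question's table.
data <- data.frame(
  Taxa = c("Arthropoda", "Arachnida", "Araneae",
           "Nematoda", "Chromadorea", "Plectidae"),
  Phylogeny = c(rep("Arthropoda", 3), rep("Nematoda", 3)),
  stringsAsFactors = FALSE
)

# Assumed light-to-dark endpoint colors per phylum; replace with any pair you like.
ramps <- list(
  Arthropoda = c("#FDD0A2", "#D94801"),  # orange hues
  Nematoda   = c("#C7E9C0", "#238B45")   # green hues
)

# For each phylum, interpolate one shade per taxon between its two endpoints
# and name each shade after the taxon it belongs to.
pal <- unlist(lapply(names(ramps), function(p) {
  taxa <- data$Taxa[data$Phylogeny == p]
  setNames(colorRampPalette(ramps[[p]])(length(taxa)), taxa)
}))
```

The named vector pal could then be passed to scale_fill_manual(values = pal) in a plot that maps fill to the taxon column. Every phylum present in the data needs an entry in ramps, and with many taxa per phylum the neighboring shades will still be hard to tell apart.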