Adding a column with metadata to a geom_tile ggplot

Date: 2018-03-26 08:32:48

Tags: r ggplot2

I have the following data:

id <- 1:80
gyrA <- sample(c(1,0), 80, replace = TRUE)
parC <- sample(c(1,0), 80, replace = TRUE)
marR <- sample(c(1,0), 80, replace = TRUE)
qnrS <- sample(c(1,0), 80, replace = TRUE)
marA <- sample(c(1,0), 80, replace = TRUE)
ydhE <- sample(c(1,0), 80, replace = TRUE)
qnrA <- sample(c(1,0), 80, replace = TRUE)
qnrB <- sample(c(1,0), 80, replace = TRUE)
qnrD <- sample(c(1,0), 80, replace = TRUE)
mcbE <- sample(c(1,0), 80, replace = TRUE)
oqxAB <- sample(c(1,0), 80, replace = TRUE)
species <- sample(c("Wild bird","Pig","Red Fox","Broiler"), 80, replace = TRUE)

test_data <- data.frame(id,species,gyrA,parC,marR,marA,qnrS,qnrA,qnrB,qnrD,ydhE,mcbE,oqxAB)


library(dplyr)
library(tidyr)  # gather() comes from tidyr, not dplyr

plot_data <- test_data %>%
  gather(key = "gene", value = "value", -id) %>%
  mutate(id = factor(id, levels = unique(id)),
         gene = factor(gene, levels = unique(gene)))

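I want to create a heatmap of the presence/absence of genes in the data. However, I also want a column showing the species in the same plot. I have gathered all the presence/absence columns (gyrA, parC, etc.) into one column.

I have managed to create the heatmap, but without the species. I would like to be able to add columns holding any other data related to these samples that I may obtain later.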

The plot:

library(ggplot2)

ggplot(plot_data, aes(gene, id, fill = value))+
  geom_tile(color = "black")+
  theme_classic()

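How can I add a column with the species to the plot so that it looks like this?

[image]

Is there a simple way to do this? If it is easier, would it at least be possible to create a text column stating which species each row represents?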

1 Answer:

Answer 0 (score: 1)

Edit

Based on his/her comment, I adjusted the sample data to reflect the OP's actual problem.

library(ggplot2)

colors <- c("#b13da1", "#00b551", "#fff723", "#ff0022")

# Label each tile "0" (gene absent) or "1 - <species>" (gene present)
plot_data$label <- paste("1 -", as.character(plot_data$species))
plot_data$label[plot_data$value == 0] <- "0"

ggplot(plot_data, aes(gene, id, fill = label))+
  geom_tile(color = "black")+
  theme_classic()+
  scale_fill_manual(values = c("white", colors), name = "Value")+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  xlab("Gene") + ylab("id")

[image]

Clustering by species to improve readability:

library(forcats)

# Order the samples on the y axis by species
ggplot(plot_data, aes(gene, fct_reorder(id, as.numeric(factor(species))), fill = label))+
  geom_tile(color = "black")+
  theme_classic()+
  scale_fill_manual(values = c("white", colors), name = "Value")+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  xlab("Gene") + ylab("id")

[image]

Something closer to the OP's, using some workaround (but I think the resulting figure is not as clear as the first one).

# One extra row per sample for a "Species" column
# (assumes the first 10 rows hold the 10 distinct samples)
newdata <- plot_data[1:10,]
newdata$gene <- "Species"
newdata$value <- newdata$species
plot_data <- rbind(plot_data, newdata)

plot_data$value <- as.factor(plot_data$value)
levels(plot_data$value) <- c(levels(plot_data$value), "") # add artificial levels to split the legend into 2 columns
levels(plot_data$value) <- c(levels(plot_data$value), " ")
# Legend order: 0, 1, the two blank levels, then the four species
plot_data$value <- factor(plot_data$value, levels(plot_data$value)[c(1,2,7,8,3:6)])
# Move "Species" in front of the 11 gene columns
plot_data$gene <- factor(plot_data$gene, levels(plot_data$gene)[c(12, 1:11)])

colors <- c("#b13da1", "#00b551", "#fff723", "#ff0022")

ggplot(plot_data, aes(gene, id, fill = value))+
  geom_tile()+
  geom_tile(color = "black", show.legend = FALSE)+
  theme_classic()+
  scale_fill_manual(values = c("#403f3f", "grey", "white", "white",
  colors), name = "Value Species", drop = FALSE)+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  guides(fill = guide_legend(ncol = 2)) +
  xlab("Gene") + ylab("id")+
  scale_x_discrete(position = "top")

[image]

Sample data

The code above expects plot_data in long format, one row per sample and gene, with the columns id, species, gene and value (0 or 1); species is kept as its own column instead of being gathered together with the genes.
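The answer's snippets refer to `plot_data$species` and take the first 10 rows as one row per sample, so they need a long-format table that still carries `species`. The sketch below is one way such data might be built from the question's setup. It is not the answerer's original sample data: the seed, the choice of 10 samples, the loop that fills the gene columns, and the conversion of `species` to a factor are all assumptions made for illustration.

```r
library(tidyr)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

genes <- c("gyrA", "parC", "marR", "marA", "qnrS", "qnrA",
           "qnrB", "qnrD", "ydhE", "mcbE", "oqxAB")
n <- 10  # assumption: 10 samples, matching plot_data[1:10, ] in the answer

# Wide table: one row per sample, one 0/1 column per gene
test_data <- data.frame(
  id = 1:n,
  species = sample(c("Wild bird", "Pig", "Red Fox", "Broiler"), n, replace = TRUE)
)
for (g in genes) {
  test_data[[g]] <- sample(c(1, 0), n, replace = TRUE)
}

# Long table: exclude both id and species from the gather,
# so species stays a separate column next to gene and value
plot_data <- gather(test_data, key = "gene", value = "value", -id, -species)

# Fix the display order of samples and genes; store species as a factor
plot_data$id <- factor(plot_data$id, levels = unique(plot_data$id))
plot_data$gene <- factor(plot_data$gene, levels = unique(plot_data$gene))
plot_data$species <- factor(plot_data$species)
```

Because `gather()` stacks the gene columns one after another, the first `n` rows of `plot_data` are the `gyrA` rows for samples 1 to `n`, which is what the "Species" column workaround relies on.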