Plotting Bacteria according to Food Groups & Abundance in R

Time: 2016-10-09 15:48:58

Tags: r plot ggplot2

I have a dataframe that includes four bacteria types: R, B, P, Bi - this is in variable.x

value.y is their abundance and variable.y is various groups they are in.

I would like to plot them according to their food categories: "FiberCategory", "FruitCategory", "VegetablesCategory" & "WholegrainCategory". I have made 4 separate files, each of which looks like this:

Sample   Bacteria   Abundance   Category Level
30841102    R       0.005293192 1        Low
30841102    P       0.000002570 1        Low
30841102    B       0.005813275 1        Low
30841102    Bi      0.000000000 1        Low
49812105    R       0.003298709 1        Low
49812105    P       0.000000855 1        Low
49812105    B       0.131147541 1        Low
49812105    Bi      0.000350086 1        Low

So, I would like a bar plot of how much of each bacteria is in each category. So it should be 4 plots, for each bacteria, with value on the y-axis and food category on the x-axis.

I have tried this code:

library(dplyr)
genus_veg %>% group_by(Genus, Abundance) %>% summarise(Abundance = sum(Abundance)) %>% 
ggplot(aes(x = Level, y= Abundance, fill = Genus)) + geom_bar(stat="identity")

But get this error:

Error: cannot modify grouping variable

Any suggestions?

2 Answers:

Answer 0: (score: 1)

TL;DR: plot with facets

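Your question is super unclear, so I have interpreted it based on this statement:

  So, I would like a bar plot of how much of each bacteria is in each category. So it should be 4 plots, for each bacteria, with value on the y-axis and food category on the x-axis.

as: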

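  • You want a bar plot
  • You want 4 plots, one for each bacteria type: R, B, P, Bi
  • x-axis = food category
  • y-axis = bacterial abundance

Input

The input data is also unclear, e.g. you have not described what "Sample", "Level", or "Category" are. Ideally, you would keep all the food categories in a single data frame, e.g.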

library(tidyr)
library(dplyr)
library(ggplot2)

## The categories you have defined
bacteria <- c("R", "B", "P", "Bi")
food <- c("FiberCategory", "FruitCategory", "VegetablesCategory", "WholegrainCategory")

## Create dummy data for plotting
set.seed(1)
num_rows <- length(bacteria)
num_cols <- length(food)
dummydata <- 
  matrix(data = abs(rnorm(num_rows*num_cols, mean=0.01, sd=0.05)),
         nrow=num_rows, ncol=num_cols)
rownames(dummydata) <- bacteria
colnames(dummydata) <- food
dummydata <-
  dummydata %>%
  as.data.frame() %>%
  tibble::rownames_to_column("bacteria") %>% 
  gather(food, abundance, -bacteria)

Which gives the following output:

#> dummydata
#   bacteria               food   abundance
#1         R      FiberCategory 0.021322691
#2         B      FiberCategory 0.019182166
#3         P      FiberCategory 0.031781431
#4        Bi      FiberCategory 0.089764040
#5         R      FruitCategory 0.026475389
#6         B      FruitCategory 0.031023419
#7         P      FruitCategory 0.034371453
#8        Bi      FruitCategory 0.046916235
#9         R VegetablesCategory 0.038789068
#10        B VegetablesCategory 0.005269419
#11        P VegetablesCategory 0.085589058
#12       Bi VegetablesCategory 0.029492162
#13        R WholegrainCategory 0.021062029
#14        B WholegrainCategory 0.100734994
#15        P WholegrainCategory 0.066246546
#16       Bi WholegrainCategory 0.007753320
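
If your data really is in four separate per-category data frames, as the question describes, one way to get to a single long data frame like the one above is `dplyr::bind_rows()` with `.id`. This is only a sketch: the `genus_fiber`, `genus_fruit` and `genus_grain` names, the `make_category()` helper, and all abundance values are hypothetical placeholders (only `genus_veg` appears in the question), and I am assuming each file has `Bacteria` and `Abundance` columns.

```r
library(dplyr)

## Hypothetical helper: builds a stand-in for one per-category file.
## The abundance values are placeholders, not real measurements.
make_category <- function(abundance) {
  data.frame(
    Bacteria  = c("R", "P", "B", "Bi"),
    Abundance = abundance,
    stringsAsFactors = FALSE
  )
}

genus_fiber <- make_category(c(0.01, 0.02, 0.03, 0.04))
genus_fruit <- make_category(c(0.02, 0.01, 0.05, 0.00))
genus_veg   <- make_category(c(0.03, 0.00, 0.10, 0.01))
genus_grain <- make_category(c(0.04, 0.02, 0.07, 0.02))

## Stack the four data frames; the list names become a new 'food' column
all_food <- bind_rows(
  list(FiberCategory      = genus_fiber,
       FruitCategory      = genus_fruit,
       VegetablesCategory = genus_veg,
       WholegrainCategory = genus_grain),
  .id = "food"
)
```

The result has one row per bacteria per food category, which is the shape the plotting code expects (with `Bacteria` and `Abundance` in place of `bacteria` and `abundance`).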

Plot

If you have your data formatted as above, you can simply do:

dummydata %>%
  ggplot(aes(x = food,
             y = abundance,
             group = bacteria)) +
  geom_bar(stat="identity") +

  ## Split into 4 plots 
  ## Note: can also use 'facet_grid' to do this
  facet_wrap(~bacteria) +
  theme(
    ## rotate the x-axis label
    axis.text.x = element_text(angle=90, hjust=1, vjust=.5)
    )
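
As for the error in the question: `group_by(Genus, Abundance)` makes `Abundance` a grouping variable, and `summarise()` then tries to overwrite it, which dplyr refuses to do. Grouping by the columns that define a bar (and not by the value being summed) should avoid it. A minimal sketch, assuming the columns are named as in the sample table (`Bacteria` rather than `Genus`) and using only the eight sample rows shown in the question:

```r
library(dplyr)
library(ggplot2)

## The eight sample rows from the question, column names as shown there
genus_veg <- data.frame(
  Sample    = rep(c(30841102, 49812105), each = 4),
  Bacteria  = rep(c("R", "P", "B", "Bi"), times = 2),
  Abundance = c(0.005293192, 0.000002570, 0.005813275, 0.000000000,
                0.003298709, 0.000000855, 0.131147541, 0.000350086),
  Category  = 1,
  Level     = "Low",
  stringsAsFactors = FALSE
)

## Group by what identifies a bar; keep Level so it is still there for the x-axis
veg_summary <- genus_veg %>%
  group_by(Bacteria, Level) %>%
  summarise(Abundance = sum(Abundance)) %>%
  ungroup()

## One bar per bacteria within each level
veg_plot <- ggplot(veg_summary, aes(x = Level, y = Abundance, fill = Bacteria)) +
  geom_bar(stat = "identity", position = "dodge")
```

Whether summing across samples is the right summary (as opposed to, say, a mean) depends on what you want the bars to represent.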


Answer 1: (score: 1)

TL;DR: combine individual plots with cowplot

Another interpretation of the super unclear question, this time based on:

  

  Plotting Bacteria according to Food Groups & Abundance in R

  I would like to plot them according to their food categories: "FiberCategory", "FruitCategory", "VegetablesCategory" & "WholegrainCategory". I have made 4 separate files

You may be asking for:

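  • You want a bar plot
  • You want 4 plots, one for each food category
  • x-axis = bacteria type
  • y-axis = bacterial abundance

Input

Suppose you have one data frame per food category. (Again, I am using dummy data.)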

library(tidyr)
library(dplyr)
library(ggplot2)

## The categories you have defined
bacteria <- c("R", "B", "P", "Bi")
food <- c("FiberCategory", "FruitCategory", "VegetablesCategory", "WholegrainCategory")

## Create dummy data for plotting
set.seed(1)
num_rows <- length(bacteria)
num_cols <- length(food)
dummydata <- 
  matrix(data = abs(rnorm(num_rows*num_cols, mean=0.01, sd=0.05)),
         nrow=num_rows, ncol=num_cols)
rownames(dummydata) <- bacteria
colnames(dummydata) <- food
dummydata <-
  dummydata %>%
  as.data.frame() %>%
  tibble::rownames_to_column("bacteria") %>% 
  gather(food, abundance, -bacteria)


## If we have 4 data frames
filter_food <- function(dummydata, foodcat){
  dummydata %>%
    filter(food == foodcat) %>% 
    select(-food)
}
dd_fiber <- filter_food(dummydata, "FiberCategory")
dd_fruit <- filter_food(dummydata, "FruitCategory")
dd_veg <- filter_food(dummydata, "VegetablesCategory")
dd_grain <- filter_food(dummydata, "WholegrainCategory")

One of these data frames looks like:

#> dd_grain
#  bacteria  abundance
#1        R 0.02106203
#2        B 0.10073499
#3        P 0.06624655
#4       Bi 0.00775332

Plot

You can create the individual plots. (Here, I use a function to generate my plots.)

plot_food <- function(dd, title=""){
  dd %>%
    ggplot(aes(x = bacteria, y = abundance)) +
    geom_bar(stat = "identity") +
    ggtitle(title)
}
plt_fiber <- plot_food(dd_fiber, "fiber")
plt_fruit <- plot_food(dd_fruit, "fruit")
plt_veg <- plot_food(dd_veg, "veg")
plt_grain <- plot_food(dd_grain, "grain")

Then combine them using cowplot:
cowplot::plot_grid(plt_fiber, plt_fruit, plt_veg, plt_grain)
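
If you would rather not name every data frame and plot by hand, the same steps can be sketched with `split()` and `lapply()`, passing the resulting list to `plot_grid()` through its `plotlist` argument. This is just one possible arrangement; `per_food`, `plots` and `combined` are names I made up for illustration, and the dummy data is rebuilt directly as a data frame so that the block runs on its own.

```r
library(ggplot2)

bacteria <- c("R", "B", "P", "Bi")
food <- c("FiberCategory", "FruitCategory", "VegetablesCategory", "WholegrainCategory")

## Same dummy data as above, built directly in long format
set.seed(1)
dummydata <- data.frame(
  bacteria  = rep(bacteria, times = length(food)),
  food      = rep(food, each = length(bacteria)),
  abundance = abs(rnorm(length(bacteria) * length(food), mean = 0.01, sd = 0.05)),
  stringsAsFactors = FALSE
)

## One data frame per food category, named by category
per_food <- split(dummydata, dummydata$food)

## One bar plot per food category, titled with the category name
plots <- lapply(names(per_food), function(category) {
  ggplot(per_food[[category]], aes(x = bacteria, y = abundance)) +
    geom_bar(stat = "identity") +
    ggtitle(category)
})

## Arrange the four plots in a 2 x 2 grid
combined <- cowplot::plot_grid(plotlist = plots, ncol = 2)
```

Printing `combined` draws the grid, and adding a fifth food category needs no extra plotting code.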
