ggplot2 object in phyloseq - how to reorder x-axis entries?

Time: 2017-04-10 05:06:19

Tags: r ggplot2 phyloseq

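library("phyloseq")  # import_qiime_sample_data, import_biom, merge_phyloseq, plot_bar
library("ggplot2")   # ylab, geom_bar, aes

mapfile = "map_soil_final3.txt"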
map = import_qiime_sample_data(mapfile)
print(map)

tree = read_tree("rep_set.tre")

biom = "otu_table_15000_json.biom"
biomfile = import_biom(biom,parseFunction=parse_taxonomy_default)

testdata = merge_phyloseq(biomfile,tree,map)
print(testdata)

p = plot_bar(testdata, "Order", fill = "Phylum", facet_grid = ~Description) +
             ylab("Percentage of Sequences") 
relative_ab = p + geom_bar(aes(color = Phylum, fill = Phylum),
                           stat = "identity", position = "stack") 
relative_ab

[Image: Sampleplot]

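I have a plot of various taxa here. Each bar represents an Order (the names on the x-axis) within a Phylum (the color). Right now the Orders are sorted alphabetically, which leaves the Phyla scattered all over the place. It would be great if I could group the Orders together by Phylum, so that basically all bars of the same color sit next to each other. Can anyone help me? Thanks!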

https://www.dropbox.com/sh/5xn5si352bgslg0/AADyI_ON39_55qvNdvB167Lga?dl=0

> str(ent10)
Formal class 'phyloseq' [package "phyloseq"] with 5 slots
..@ otu_table:Formal class 'otu_table' [package "phyloseq"] with 2    slots
.. .. ..@ .Data        : num [1:20, 1:73] 0 11 86 237 11 8 16 4 15 19   ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:20] "4128270" "2473794" "811074" "4388819" ...
.. .. .. .. ..$ : chr [1:73] "AB17S" "UR6S" "AB4S" "AB8S" ...
.. .. ..@ taxa_are_rows: logi TRUE
..@ tax_table:Formal class 'taxonomyTable' [package "phyloseq"] with 1 slot
.. .. ..@ .Data: chr [1:20, 1:7] "k__Bacteria" "k__Bacteria"  "k__Bacteria" "k__Bacteria" ...
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:20] "4128270" "2473794" "811074" "4388819" ...
.. .. .. .. ..$ : chr [1:7] "Kingdom" "Phylum" "Class" "Order" ...
..@ sam_data :'data.frame': 73 obs. of  4 variables:
Formal class 'sample_data' [package "phyloseq"] with 4 slots
.. .. ..@ .Data    :List of 4
.. .. .. ..$ : Factor w/ 73 levels "AB10S","AB11S",..: 8 70 13 17 9 11   66 15 12 22 ...
.. .. .. ..$ : Factor w/ 73 levels "D1_pb_s.fasta",..: 67 45 34 50 70 26 29 42 30 14 ...
.. .. .. ..$ : Factor w/ 4 levels "Ash_Basins","Pond_B",..: 1 4 1 1 1 1 4 1 1 2 ...
.. .. .. ..$ : Factor w/ 31 levels "D1","D10","D11",..: 29 21 19 23 30 17 17 21 18 13 ...
.. .. ..@ names    : chr [1:4] "X.SampleID" "InputFileName" "Description" "TagCombo"
.. .. ..@ row.names: chr [1:73] "AB17S" "UR6S" "AB4S" "AB8S" ...
.. .. ..@ .S3Class : chr "data.frame"
..@ phy_tree :List of 5
.. ..$ edge       : int [1:38, 1:2] 21 22 23 23 22 24 25 25 24 21 ...
.. ..$ Nnode      : int 19
.. ..$ tip.label  : chr [1:20] "4128270" "2473794" "811074" "4388819" ...
.. ..$ edge.length: num [1:38] 0.00016 0.02274 0.3467 0.80367 0.00564 ...
.. ..$ node.label : chr [1:19] "1.000" "0.815" "0.922" "0.860" ...
.. ..- attr(*, "class")= chr "phylo"
.. ..- attr(*, "order")= chr "cladewise"
..@ refseq   : NULL

1 answer:

Answer 0: (score: 3)

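Adding an extra geom_bar layer won't help you. In fact, you don't even need any extra ggplot2 code to get the ordering you want: ggplot2 uses the factor levels to decide the order of a discrete axis, and you can modify those with base R commands. Below is a fully reproducible example, including some extra theming to prettify the labels, which is the only place where ggplot2-specific commands are needed. Each line consisting of just p or p + ... is simply a call to render an intermediate plot; in practice you can skip these if all you want is the final figure.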

# Preliminaries
rm(list = ls())
library("phyloseq"); packageVersion("phyloseq")
data("GlobalPatterns")
N = 100L
gpN = prune_taxa(names(sort(taxa_sums(GlobalPatterns), decreasing = TRUE)[1:N]), GlobalPatterns)
# Define the initial plot
p = plot_bar(gpN, fill = "Phylum", x = "Order")
p
# Adjust the factor levels. No ggplot2 commands needed. Already what you want.
p$data$Order <- as.character(p$data$Order)
p$data$Order <- factor(x = p$data$Order, 
                       levels = unique(p$data$Order[order(as.character(p$data$Phylum))]))
p
# pretty the axis labels with ggplot2
library("ggplot2")
p + theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))
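The factor-level step is the heart of the answer, so here is a minimal base R sketch of it in isolation, with no phyloseq or ggplot2 required. The data frame and its Order/Phylum pairs are made up purely for illustration and are not taken from the question's dataset.

```r
# Toy data (hypothetical, for illustration only): four Orders in two Phyla,
# arranged so that alphabetical Order sorting interleaves the Phyla.
df <- data.frame(
  Order  = c("Bacillales", "Actinomycetales", "Clostridiales", "Bifidobacteriales"),
  Phylum = c("Firmicutes", "Actinobacteria", "Firmicutes", "Actinobacteria"),
  stringsAsFactors = FALSE
)

# Default factor levels are sorted alphabetically by Order,
# which is what a discrete x-axis would follow.
default_levels <- levels(factor(df$Order))

# Same idea as in the answer: sort the rows by Phylum, then take the Orders
# in order of first appearance and use them as the factor levels.
grouped_levels <- unique(df$Order[order(df$Phylum)])
df$Order <- factor(df$Order, levels = grouped_levels)

print(default_levels)
print(levels(df$Order))
```

After this, levels(df$Order) lists all Orders of one Phylum before moving on to the next, and that is the order a discrete x-axis would follow. If the Orders should also be alphabetical within each Phylum, one option is to pass a second sort key, for example order(df$Phylum, df$Order).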