Some of my data contains NA values, and I want those values to appear at the top of each bar.
library("phyloseq"); packageVersion("phyloseq")
library(ggplot2)
library(scales)
data("GlobalPatterns")
# Keep the 40 most abundant taxa
TopNOTUs <- names(sort(taxa_sums(GlobalPatterns), TRUE)[1:40])
gp.ch <- prune_taxa(TopNOTUs, GlobalPatterns)  # prune_species() is the deprecated name
mdf <- psmelt(gp.ch)
# A missing Genus is pasted as the text "NA", giving labels such as "<Phylum>-NA"
mdf$group <- paste0(mdf$Phylum, "-", mdf$Genus)
mdf <- as.data.frame(mdf)
mdf$Genus <- as.character(mdf$Genus)
mdf[is.na(mdf)] <- 0
# Plot results
ggplot(mdf, aes(Phylum)) +
  geom_bar(aes(fill = group), colour = "grey", position = "stack")
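Right now the NA elements appear in the middle of each bar, because the stack is ordered alphabetically. How can I make the NA element the top element of each stack?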
Answer 0 (score: 1)
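You can change NA to a string representation and then reorder the factor levels before plotting, putting the NA level first.
There are several ways to do this; here is a tidyverse approach: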
library(tidyverse)
# see below for where the data comes from
levs <- levels(data$model)
data %>%
  # fct_explicit_na() is deprecated since forcats 1.0.0;
  # fct_na_value_to_level(model, level = "NA") is its replacement
  mutate(model = fct_explicit_na(model, "NA"),
         model = factor(model, levels = c("NA", levs))) %>%
  ggplot(aes(make)) +
  geom_bar(aes(fill = model), position = "stack")
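As one of the other ways, the same relabel-then-reorder step can be sketched in base R. `put_na_first` is a hypothetical helper name chosen for illustration, not a function from any package, and the sketch assumes the text "NA" is acceptable as the replacement label.

```r
# Hypothetical helper: make NA an explicit level and move it to the front.
put_na_first <- function(x, na_label = "NA") {
  x <- as.factor(x)
  levs <- levels(x)                 # existing levels; NA is not one of them
  chr <- as.character(x)
  chr[is.na(chr)] <- na_label       # replace missing values with a text label
  factor(chr, levels = c(na_label, setdiff(levs, na_label)))
}

model <- factor(c("RX4", NA, "230"), levels = c("RX4", "230", "Corolla"))
levels(put_na_first(model))
```

With the example data this would be used as `data$model <- put_na_first(data$model)` before the `ggplot()` call. It relies on the same ggplot2 behaviour as the pipeline above: the first factor level is drawn at the top of a stacked bar.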
For the data, I used a stripped-down version of mtcars:
# using a stripped-down version of mtcars
data <- mtcars %>%
  rownames_to_column("car_type") %>%
  filter(stringr::str_detect(car_type, "Merc|Mazda|Toyota")) %>%
  separate(car_type, c("make", "model"), extra = "drop") %>%
  mutate(model = factor(model, levels = c("RX4", "230", "Corolla")))
data
     make   model  mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1   Mazda     RX4 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
2   Mazda     RX4 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
3    Merc    <NA> 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
4    Merc     230 22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
5    Merc    <NA> 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
6    Merc    <NA> 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
7    Merc    <NA> 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
8    Merc    <NA> 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
9    Merc    <NA> 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
10 Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
11 Toyota    <NA> 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
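The same idea should carry over to the `group` column from the question, where a missing Genus ends up as a label ending in "-NA". The sketch below is only an illustration: `mdf_toy` is a made-up stand-in for the output of `psmelt()`, and it assumes no real Genus name ends in "-NA".

```r
# Toy stand-in for the melted phyloseq table (rows invented for illustration)
mdf_toy <- data.frame(
  Phylum = c("Firmicutes", "Firmicutes", "Firmicutes", "Bacteroidetes", "Bacteroidetes"),
  Genus  = c("Clostridium", NA, "Streptococcus", NA, "Bacteroides"),
  stringsAsFactors = FALSE
)

# paste0() turns a missing Genus into the text "NA"
mdf_toy$group <- paste0(mdf_toy$Phylum, "-", mdf_toy$Genus)

# Put every "<Phylum>-NA" label ahead of the other labels
labels <- sort(unique(mdf_toy$group))
is_na_label <- endsWith(labels, "-NA")
mdf_toy$group <- factor(mdf_toy$group,
                        levels = c(labels[is_na_label], labels[!is_na_label]))
levels(mdf_toy$group)
```

Since each bar only contains the groups of one Phylum, listing all the "-NA" labels first is enough to put the NA segment at the top of every stack.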