Bubble chart - R

Time: 2014-07-17 09:59:28

Tags: r statistics

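I would like to make a bubble chart of two stages for a number of individuals; the end goal is to see whether the genotypes score the same in the 2 stages. So I would like Stage_1 on the x axis and Stage_2 on the y axis.

I really like this tutorial, but I don't know what to put in the circles.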

        Geno  Stage_1 Stage_2
Individual_1        9     8.1
Individual_2      3.1       1
Individual_3      4.1       2
Individual_4        9     6.1
Individual_5      2.9       1
Individual_6      4.1     1.4
Individual_7      4.4     1.5
Individual_8        3       1
Individual_9      3.1     1.3
Individual_10     4.1     1.8
Individual_11     8.3       4
Individual_12     8.6     5.5
Individual_13       9     5.3
Individual_14       9     4.3
Individual_15       7       2
Individual_16       9     5.8
Individual_17       9     6.4
Individual_18     5.4     1.1
Individual_19     5.8     2.3
Individual_20     5.3     1.5
Individual_21       9     6.8
Individual_22       8     3.3
Individual_23     8.1     7.6

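1 answer:

Answer 0: (score: 2)

@Osssan is spot on. This would be an inappropriate use of a bubble chart, since you are comparing stages across different elements (i.e., you are comparing values in multiple categories) and you do not have the three dimensions a proper bubble chart needs. That is: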

# NOTE: dput(VARIABLE) is a much better way to post data into SO posts:

dat <- structure(list(Geno = structure(c(1L, 12L, 17L, 18L, 19L, 20L, 
                  21L, 22L, 23L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 13L, 
                  14L, 15L, 16L), .Label = c("Individual_1", "Individual_10", "Individual_11", 
                  "Individual_12", "Individual_13", "Individual_14", "Individual_15", 
                  "Individual_16", "Individual_17", "Individual_18", "Individual_19", 
                  "Individual_2", "Individual_20", "Individual_21", "Individual_22", 
                  "Individual_23", "Individual_3", "Individual_4", "Individual_5", 
                  "Individual_6", "Individual_7", "Individual_8", "Individual_9"
                  ), class = "factor"), Stage_1 = c(9, 3.1, 4.1, 9, 2.9, 4.1, 4.4, 
                  3, 3.1, 4.1, 8.3, 8.6, 9, 9, 7, 9, 9, 5.4, 5.8, 5.3, 9, 8, 8.1
                  ), Stage_2 = c(8.1, 1, 2, 6.1, 1, 1.4, 1.5, 1, 1.3, 1.8, 4, 5.5, 
                  5.3, 4.3, 2, 5.8, 6.4, 1.1, 2.3, 1.5, 6.8, 3.3, 7.6)), .Names = c("Geno", 
                  "Stage_1", "Stage_2"), class = "data.frame", row.names = c(NA, -23L))

# get difference between stages

dat$diff = dat$Stage_2 - dat$Stage_1

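# simple barplot (requires ggplot2)

library(ggplot2)

# refer to columns by bare name inside aes(); dat$diff would bypass the data argument
gg <- ggplot(dat, aes(x=reorder(Geno, diff), y=diff))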
gg <- gg + geom_bar(stat="identity", width=0.25, fill="steelblue")
gg <- gg + labs(x="", y="Genotype Stage 1/2 Diff", title="Genotype Stage Comparison")
gg <- gg + coord_flip()
gg <- gg + theme_bw()
gg <- gg + theme(panel.border=element_blank())
gg <- gg + theme(panel.grid=element_blank())
gg 


# bubble plot

dat$label <- gsub("Individual_", "", dat$Geno)

gg <- ggplot(dat, aes(x=Stage_1, y=Stage_2))
gg <- gg + geom_point(aes(size=diff, color=Geno))
gg <- gg + geom_text(aes(label=label), size=4, hjust=1.5)
gg <- gg + theme_bw()
gg <- gg + theme(legend.position="none")
gg


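The bar plot clearly shows which genotypes differ between stages, and does so far more intuitively than the bubble plot. (One could try to scale the bubbles better, but the differences would still be harder to discern and compare, and it would not really make good use of that chart type.)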
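For anyone who does want to try scaling the bubbles better, here is a minimal sketch of one way it might be done. Every Stage_2 score in this data set is lower than its Stage_1 score, so `diff` is negative throughout and the default size scale draws the largest drops as the smallest bubbles. The snippet maps bubble area to the absolute difference instead. The `abs_diff` column, the `max_size` value and the dashed y = x reference line are my own assumptions, not part of the answer above.

```r
library(ggplot2)

# Same scores as in the question, rebuilt so the snippet runs on its own
dat <- data.frame(
  Geno    = paste0("Individual_", 1:23),
  Stage_1 = c(9, 3.1, 4.1, 9, 2.9, 4.1, 4.4, 3, 3.1, 4.1, 8.3, 8.6,
              9, 9, 7, 9, 9, 5.4, 5.8, 5.3, 9, 8, 8.1),
  Stage_2 = c(8.1, 1, 2, 6.1, 1, 1.4, 1.5, 1, 1.3, 1.8, 4, 5.5,
              5.3, 4.3, 2, 5.8, 6.4, 1.1, 2.3, 1.5, 6.8, 3.3, 7.6)
)

# Assumption: use the magnitude of the change, so a bigger bubble means a bigger change
dat$abs_diff <- abs(dat$Stage_2 - dat$Stage_1)
dat$label <- sub("Individual_", "", dat$Geno)

gg <- ggplot(dat, aes(x = Stage_1, y = Stage_2))
# Points on the dashed line would have identical scores in both stages
gg <- gg + geom_abline(slope = 1, intercept = 0, linetype = "dashed", colour = "grey50")
gg <- gg + geom_point(aes(size = abs_diff), colour = "steelblue", alpha = 0.6)
gg <- gg + geom_text(aes(label = label), size = 3)
# scale_size_area() makes bubble area proportional to the value (zero maps to zero area)
gg <- gg + scale_size_area(max_size = 10)
gg <- gg + theme_bw()
gg <- gg + theme(legend.position = "none")
gg
```

Even with this scaling, overlapping points (several individuals share a Stage_1 score of 9) remain hard to tell apart, which is consistent with the point made above about the bar plot being easier to read.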