Melting data gives wrong y values when plotting geom_bar(position = "dodge")?

Time: 2012-06-16 00:08:38

Tags: r ggplot2

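I have a data frame called split2_data (actually a subset of a larger data frame, with unused levels dropped). It contains a column "Loci", which is the factor I want on the x-axis, and several columns of y values (note: all of these values are <= 1) that I want to plot side by side at their respective x factor.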

The data frame:

structure(list(Loci = structure(1:8, .Label = c("C485", "C487_PigTa", 
"C536", "Carey", "Cool", "Coyote", "Deadpool", "Epstein"), class = "factor"), 
All = structure(c(5L, 6L, 7L, 1L, 2L, 4L, 3L, 8L), .Label = c("0.0246", 
"0.0352", "0.0563", "0.0646", "0.2349", "0.3242", "0.3278", 
"0.6854"), class = "factor"), X1_only = structure(c(4L, 3L, 
2L, 1L, 6L, 6L, 6L, 5L), .Label = c("0.0133", "0.7292", "0.8586", 
"0.9377", "0.961", "1"), class = "factor"), X78_only = structure(c(7L, 
6L, 4L, 5L, 8L, 3L, 1L, 2L), .Label = c("0.0018", "0.0175", 
"0.4958", "0.6055", "0.7472", "0.7563", "0.825", "1"), class = "factor"), 
X8_removed = structure(c(5L, 6L, 8L, 1L, 2L, 3L, 4L, 7L), .Label = c("0.0181", 
"0.0348", "0.1482", "0.1706", "0.2217", "0.2602", "0.6748", 
"0.7123"), class = "factor"), X8_only = structure(c(6L, 7L, 
3L, 8L, 5L, 4L, 1L, 2L), .Label = c("0.1266", "0.1945", "0.4389", 
"0.4496", "0.7078", "0.709", "0.8882", "1"), class = "factor"), 
X7_removed = structure(c(6L, 4L, 5L, 2L, 1L, 3L, 7L, 8L), .Label = c("0.0159", 
"0.02", "0.0541", "0.3232", "0.3972", "0.4226", "0.4919", 
"0.5951"), class = "factor"), X7_only = structure(c(3L, 4L, 
7L, 5L, 6L, 8L, 1L, 2L), .Label = c("0.0082", "0.1759", "0.4957", 
"0.5248", "0.6665", "0.6789", "0.8372", "1"), class = "factor"), 
X5_removed = structure(c(5L, 7L, 6L, 1L, 3L, 4L, 2L, 8L), .Label = c("0.0195", 
"0.0316", "0.08", "0.1069", "0.1549", "0.395", "0.4405", 
"0.6298"), class = "factor"), X5_only = structure(c(1L, 2L, 
6L, 7L, 3L, 5L, 7L, 4L), .Label = c("0.0871", "0.2022", "0.3532", 
"0.3677", "0.5292", "0.7602", "1"), class = "factor"), X4_removed = structure(c(8L, 
4L, 7L, 2L, 3L, 5L, 1L, 6L), .Label = c("0.0188", "0.0194", 
"0.0511", "0.1716", "0.1862", "0.6454", "0.661", "0.8003"
), class = "factor"), X4_only = structure(c(2L, 5L, 1L, 6L, 
7L, 3L, 8L, 4L), .Label = c("0.0026", "0.0378", "0.2884", 
"0.4386", "0.5116", "0.6549", "0.6928", "1"), class = "factor"), 
X3_removed = structure(c(5L, 7L, 6L, 1L, 2L, 3L, 4L, 8L), .Label = c("0.0612", 
"0.0627", "0.0808", "0.1636", "0.2728", "0.477", "0.5307", 
"0.6506"), class = "factor"), X3_only = structure(c(8L, 1L, 
7L, 2L, 4L, 6L, 3L, 5L), .Label = c("0.0225", "0.2111", "0.2471", 
"0.5087", "0.6294", "0.768", "0.8263", "0.8951"), class = "factor"), 
X2_removed = structure(c(4L, 5L, 6L, 3L, 7L, 2L, 1L, 8L), .Label = c("0.0526", 
"0.0608", "0.0854", "0.2036", "0.3168", "0.3668", "0.413", 
"0.7608"), class = "factor"), X2_only = structure(c(5L, 3L, 
6L, 4L, 2L, 8L, 1L, 7L), .Label = c("-", "0.0014", "0.0949", 
"0.1637", "0.1818", "0.5521", "0.8585", "1"), class = "factor"), 
X1_removed = structure(c(5L, 7L, 3L, 6L, 1L, 4L, 2L, 8L), .Label = c("0.0258", 
"0.031", "0.0496", "0.0676", "0.1053", "0.1439", "0.2823", 
"0.5465"), class = "factor")), .Names = c("Loci", "All", 
"X1_only", "X78_only", "X8_removed", "X8_only", "X7_removed", 
"X7_only", "X5_removed", "X5_only", "X4_removed", "X4_only", 
"X3_removed", "X3_only", "X2_removed", "X2_only", "X1_removed"
), row.names = 9:16, class = "data.frame")

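I couldn't work out how to do this in base R, and after careful study of other questions this is the best I could come up with:

library(reshape)
library(ggplot2)
require(ggplot2)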

split2_datam<-melt(split2_data,id="Loci")


p2<- ggplot(split2_datam, aes(x =Loci, y = value, color = variable, width=.15)) + geom_bar(position="dodge") + ylab("P-value")+ geom_hline(yintercept=0.05)+ opts(axis.text.x  = theme_text(angle=90, size=8)) + scale_y_discrete(breaks=seq(0,1)) + scale_fill_grey()
p2


#when I add stat="identity", the y values don't change- they just shrink relative to the x-axis
p2<- ggplot(split2_datam, aes(x =Loci, y = value, color = variable, width=.15)) + geom_bar(position="dodge", stat="identity") + ylab("P-value")+ geom_hline(yintercept=0.05)+ opts(axis.text.x  = theme_text(angle=90, size=8)) + scale_y_discrete(breaks=seq(0,1)) + scale_fill_grey()
p2

The plot: [image]

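You'll notice that the different variables are often much greater than 1. They shouldn't be. Any idea what is causing this and how to fix it?

Other things I don't yet know how to do or fix (maybe this question should be cross-referenced?):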

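  1. I don't know why the greyscale isn't working
  2. I don't know how to use the plot
  3. Getting the legend scale right
  4. I don't understand why my columns have an 'X' prepended (e.g. "X1_only" rather than "1_only")

Thanks very much for any suggestions!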

1 Answer:

Answer 0: (score: 1)

Your data has been read in as factors, probably because there are some "-" characters mixed in with your data.
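To see why factor columns produce bars taller than 1, here is a minimal sketch that rebuilds the X2_only column from the dput() output in the question. The variable names are only illustrative. A factor stores integer level codes, and anything that treats it as a position or number works with those codes rather than with the printed labels, which is presumably what the bars end up showing.

```r
# X2_only from the question, rebuilt with the same level order as the dput().
x2_levels <- c("-", "0.0014", "0.0949", "0.1637",
               "0.1818", "0.5521", "0.8585", "1")
x2_only <- factor(c("0.1818", "0.0949", "0.5521", "0.1637",
                    "0.0014", "1", "-", "0.8585"),
                  levels = x2_levels)

# The column is not numeric, so it cannot sit on a continuous y axis.
is_num <- is.numeric(x2_only)   # FALSE

# Coercing a factor directly yields the level codes, not the p-values.
codes <- as.integer(x2_only)    # 5 3 6 4 2 8 1 7

print(is_num)
print(codes)
```

Every code here is at least 1, which matches the symptom of values that look far too large for p-values.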

You need to convert them to NA when you read the data in, using na.strings = "-".
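A minimal sketch of two ways to get numeric columns, using only base R. The inline text is a hypothetical stand-in for the real input file, whose name and layout the question does not show; the first route declares "-" as missing at read time, and the second repairs a data frame that has already been loaded with factor columns.

```r
# Hypothetical stand-in for the real input file (not shown in the question).
raw <- "Loci All 2_only
C485 0.2349 0.1818
Deadpool 0.0563 -
Epstein 0.6854 1"

# Route 1: treat "-" as NA while reading, so the columns parse as numeric.
fixed <- read.table(text = raw, header = TRUE, na.strings = "-")
str(fixed)

# Route 2: repair data that was already read in as factors.
broken <- read.table(text = raw, header = TRUE, stringsAsFactors = TRUE)
value_cols <- setdiff(names(broken), "Loci")
broken[value_cols] <- lapply(broken[value_cols], function(col) {
  txt <- as.character(col)   # work with the labels, not the level codes
  txt[txt == "-"] <- NA      # mark the placeholder as missing
  as.numeric(txt)
})
str(broken)
```

After either route the melted value column is numeric, so a continuous y scale can be used instead of scale_y_discrete(). The sketch also shows where the 'X' prefix comes from: read.table() defaults to check.names = TRUE, which turns a header such as 2_only into the syntactically valid name X2_only.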