R waterfall chart x-axis values

Time: 2014-08-11 16:12:41

Tags: r graph ggplot2

I'm new to this and a self-taught programmer, so please bear with me.

I am trying to plot a waterfall chart using the following data frame (named balancefinal):

X1          X2            type       end          start        id
Actual      1.09112725    Actual     1.09112725   0            1
Actual Est  1.345028317   Estimated  1.345028317  0            2
Factor1     -0.28842558   Change     1.056602737  1.345028317  3
Factor2     -0.091360211  Change     0.965242526  1.056602737  4
Factor3     0.110622374   Change     1.0758649    0.965242526  5
Factor4     0.227710095   Change     1.303574995  1.0758649    6
Factor5     0.27353189    Change     1.577106884  1.303574995  7
Factor6     0.006879353   Change     1.583986238  1.577106884  8
Factor7     0.135077259   Change     1.719063497  1.583986238  9
Factor8     0.00591948    Change     1.724982976  1.719063497  10
Factor9     0.077394066   Change     1.802377042  1.724982976  11
Factor10    0.05212228    Change     1.854499322  1.802377042  12
Factor11    0.062991126   Change     1.917490448  1.854499322  13
Actual Est  1.836828552   Estimated  0            1.917490448  14
ActualB     1.278994      Actual     1.278994     0            15

I tried the following code (adapted from another example).

Code to stack the x variable names onto separate lines:

strwr <- function(str) gsub(" ", "\n", str)

library(ggplot2)
library(scales)  # provides dollar()

ggplot(balancefinal, aes(X1, fill = type)) +
  geom_rect(aes(x = X1, xmin = id - 0.45, xmax = id + 0.45, ymin = end, ymax = start), colour = "black") +
  scale_fill_manual(values = c("#90353B", "#1A476F", "burlywood1")) +
  scale_x_discrete("", breaks = levels(balancefinal$X1), labels = strwr(levels(balancefinal$X1))) +
  theme(legend.position = "bottom", legend.title = element_blank(),
        axis.text = element_text(size = 8), plot.title = element_text(vjust = 1)) +
  scale_y_continuous("", labels = dollar, breaks = seq(0, max(balancefinal$start) + 0.25, by = 0.25))

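Ideally, the order of the variables on the x-axis should be Actual, Actual Est, Factor1, ..., Factor11, Actual Est, ActualB. That is not what I get: the boxes representing the X2 values appear to be correct, but the labels are in the wrong order. Do you know why this happens? Any help or hints would be greatly appreciated.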

PS: Sorry if this is a basic question. If you think it is, let me know and I will delete it.

1 answer:

Answer 0: (score: 1)

Setting up the data:

text <- "X1  X2  type    end start   id
Actual  1.09112725  Actual  1.09112725  0   1
Actual Est  1.345028317 Estimated   1.345028317 0   2
Factor1 -0.28842558 Change  1.056602737 1.345028317 3
Factor2 -0.091360211    Change  0.965242526 1.056602737 4
Factor3 0.110622374 Change  1.0758649   0.965242526 5
Factor4 0.227710095 Change  1.303574995 1.0758649   6
Factor5 0.27353189  Change  1.577106884 1.303574995 7
Factor6 0.006879353 Change  1.583986238 1.577106884 8
Factor7 0.135077259 Change  1.719063497 1.583986238 9
Factor8 0.00591948  Change  1.724982976 1.719063497 10
Factor9 0.077394066 Change  1.802377042 1.724982976 11
Factor10    0.05212228  Change  1.854499322 1.802377042 12
Factor11    0.062991126 Change  1.917490448 1.854499322 13
Actual Est  1.836828552 Estimated   0   1.917490448 14
ActualB 1.278994    Actual  1.278994    0   15"

text <- gsub("[ ]+"," ",text)
text <- gsub("Actual Est", "ActualEst",text) # Ugh, dealing with spaces in a name.

balancefinal <- read.delim(textConnection(text), header = TRUE, sep=" ",strip.white=TRUE)

library(ggplot2)
strwr <- function(str) gsub(" ", "\n", str)
dollars <- paste0("$",seq(0, max(balancefinal$start) + 0.25, by = 0.25))
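
A side note on the `dollars` vector above: it is a fixed character vector, so it has to line up one-to-one with the `breaks` passed to `scale_y_continuous` later. The following is only a sketch of deriving both from the same sequence. The names `max_start`, `y_breaks` and `y_labels` are illustrative and not part of the original code, and the literal is the largest `start` value in the data above.

```r
# Largest `start` value in balancefinal (copied from the data above)
max_start <- 1.917490448

# Build the breaks once, then derive the labels from them so the two
# vectors always have the same length
y_breaks <- seq(0, max_start + 0.25, by = 0.25)
y_labels <- sprintf("$%.2f", y_breaks)  # always two decimals

print(y_labels)
```

These could then be passed as `breaks = y_breaks, labels = y_labels` in place of the separate `seq()` call and `dollars`.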

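So you have to be careful with factors in R. The level 'ActualEst' occurs twice, so ggplot2 merges those two rows into a single level, and it lists the levels on the chart in alphabetical order (which is how R orders factor levels by default).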
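
Both effects can be seen in base R without any plotting. This is a small illustration using a shortened, made-up version of the X1 column (the vector `x1` is an assumption for the example, not the real data frame):

```r
# Shortened stand-in for balancefinal$X1; "ActualEst" appears twice
x1 <- c("Actual", "ActualEst", "Factor1", "Factor2", "ActualEst", "ActualB")

# Default factor(): duplicate values collapse into one level and the
# levels are sorted, so "ActualB" ends up ahead of "ActualEst"
f_default <- factor(x1)
print(levels(f_default))
print(nlevels(f_default))  # fewer levels than rows

# Passing levels explicitly keeps the order of first appearance,
# but the duplicate still maps to a single level
f_ordered <- factor(x1, levels = unique(x1))
print(levels(f_ordered))
```

Ordering the levels alone is therefore not enough here: as long as two rows share the same label they share one position on a discrete axis.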

Let's rebuild the factor and rename the extra level:

# Factor levels must be unique, so work on the character values first:
balancefinal$X1 <- as.character(balancefinal$X1)
balancefinal$X1[14] <- "ActualEst2"  # give the second "ActualEst" row its own unique label

# Build the factor with its levels in row order instead of the default sorted order
balancefinal$X1_1 <- factor(balancefinal$X1, levels = balancefinal$X1)

ggplot(balancefinal, aes(X1_1, fill = type)) +
  geom_rect(aes(x = X1_1,  xmin = id - 0.45, xmax = id + 0.45, ymin = end, ymax = start), colour = "black") + 
  scale_fill_manual (values =c("#90353B", "#1A476F","burlywood1")) +
  scale_x_discrete("", breaks = levels(balancefinal$X1_1), labels = strwr(levels(balancefinal$X1_1))) +
  theme(legend.position = "bottom", legend.title=element_blank(),
    axis.text= element_text(size = 8), plot.title = element_text(vjust = 1)) + 
  scale_y_continuous("", labels = dollars, breaks = seq(0, max(balancefinal$start) + 0.25, by = 0.25))

Now it works, although you get 'ActualEst2' instead of 'ActualEst' on the axis.
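
If the duplicated label should still read 'ActualEst' on the axis, one possible approach (a sketch, not part of the original answer) is to keep unique keys as the factor levels and supply the display text separately. The vector `x1` below is a shortened, made-up stand-in for the X1 column, and `keys` is an illustrative name.

```r
# Shortened stand-in for balancefinal$X1, with the original duplicate label
x1 <- c("Actual", "ActualEst", "Factor1", "ActualEst", "ActualB")

# make.unique() appends a numeric suffix to repeated values, giving one
# distinct key per row while x1 keeps the text to display
keys <- make.unique(x1)
print(keys)

# Levels in row order, one level per row
f <- factor(keys, levels = keys)
print(nlevels(f))
```

In the plot, the keys would go on the x aesthetic and the original text into the scale, for example `scale_x_discrete("", breaks = keys, labels = x1)`, so both bars are labelled 'ActualEst' while remaining separate levels.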