geom_bar shows incorrect y-axis values in R

Asked: 2014-11-19 15:08:38

Tags: r plot ggplot2

I have the following data frame:

> str(drivePerTaskMelted)
'data.frame':   10508 obs. of  4 variables:
 $ CSS_WEEK_END_DATE: Date, format: "2012-01-13" "2012-01-20" "2012-01-27" "2012-02-03" ...
 $ patch            : Factor w/ 71 levels "BV","BVG","BVH",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Drive.Per.Task   : num  28 28.8 28.2 28.1 27.9 26.4 26.6 26.6 26.6 26.7 ...
 $ Months           : chr  "January" "January" "January" "February" ...

I want to draw a bar chart:

ggplot(drivePerTaskMelted[patch==c("BVG1","BVG2","BVG3","BVG4"),],
aes(x=patch, y=Drive.Per.Task,fill=patch)) + 
geom_bar(stat="identity") + 
geom_text(aes(label = max(Drive.Per.Task, na.rm = TRUE)))

This produces the following plot:

[Image: the resulting bar chart]

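I used stat="identity", but the plot still does not use the y values as they are. The y values are around 28, 28.2, and so on. I am also trying to label the top of each bar with its maximum y value, but the plot shows 35.2 at the bottom of the bars in a strange way.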

For example, the summary for BVG1 is:

> summary(drivePerTaskMelted[patch=="BVG1",])
 CSS_WEEK_END_DATE        patch     Drive.Per.Task     Months         
 Min.   :2012-01-13   BVG1   :148   Min.   :22.60   Length:148        
 1st Qu.:2012-09-26   BV     :  0   1st Qu.:28.38   Class :character  
 Median :2013-06-10   BVG    :  0   Median :30.20   Mode  :character  
 Mean   :2013-06-10   BVH    :  0   Mean   :30.08                     
 3rd Qu.:2014-02-22   BVG2   :  0   3rd Qu.:31.70                     
 Max.   :2014-11-07   BVG3   :  0   Max.   :35.90                     
                      (Other):  0                        

Thank you.

2 answers:

Answer 0 (score: 1)

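This might produce what you want, but it is impossible to test without your dataset. It creates a bar chart of the mean Drive.Per.Task for each patch and displays the maximum Drive.Per.Task above each bar.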

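# not tested
library(ggplot2)
# summary function for the text layer: label position (just above the mean) and label text (the maximum)
labs <- function(x) data.frame(y=mean(x)+0.2,label=round(max(x),2))
ggplot(drivePerTaskMelted[patch %in% c("BVG1","BVG2","BVG3","BVG4"),],
       aes(x=patch, y=Drive.Per.Task,fill=patch)) + 
  stat_summary(fun=mean,geom="bar")+       # ggplot2 >= 3.3.0; older versions use fun.y=mean
  stat_summary(fun.data=labs,geom="text")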

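This assumes that a vector patch is defined outside the data frame drivePerTaskMelted; otherwise, refer to the column as drivePerTaskMelted$patch.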

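Also note that patch %in% c("BVG1","BVG2","BVG3","BVG4") is not the same as patch==c("BVG1","BVG2","BVG3","BVG4"). The == form recycles the four-element vector and compares it with patch position by position, so it misses most matching rows. The %in% form is the correct way to extract the rows containing BVG1 to BVG4.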
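A minimal sketch of the difference, using a made-up character vector (the names patch_demo and wanted and their values are illustrative only and do not come from the original data):

```r
# Hypothetical stand-in for the patch column: two rows per patch
patch_demo <- c("BVG1", "BVG1", "BVG2", "BVG2", "BVG3", "BVG3", "BVG4", "BVG4")
wanted <- c("BVG1", "BVG2", "BVG3", "BVG4")

# == recycles `wanted` and compares position by position,
# so a row matches only if it happens to line up with its own value
eq_match <- patch_demo == wanted
sum(eq_match)   # only 2 of the 8 rows are selected

# %in% tests each element for membership in `wanted`
in_match <- patch_demo %in% wanted
sum(in_match)   # all 8 rows are selected
```

With real data the == form therefore silently drops most of the rows that should be kept, and it emits a warning only when the longer length is not a multiple of the shorter one.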

Here is a working demonstration using the built-in mtcars dataset.

# use built-in mtcars dataset for demonstration
df <- mtcars
df$cyl <- as.factor(df$cyl)   # number of cylinders to factor

# summary function for the text layer: label position (just above the mean) and label text (the maximum)
labs <- function(x) data.frame(y=mean(x)+0.2,label=round(max(x),2))
library(ggplot2)
ggplot(df,aes(x=cyl,y=wt,fill=cyl))+
  stat_summary(fun=mean,geom="bar")+       # ggplot2 >= 3.3.0; older versions use fun.y=mean
  stat_summary(fun.data=labs,geom="text")

Answer 1 (score: 0)

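My guess is that there are many CSS_WEEK_END_DATE values for each patch, and you are seeing the sum over all of them. Are you looking at a specific date? Could you run the following and see whether the bars and values look better?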

ggplot(drivePerTaskMelted[patch %in% c("BVG1","BVG2","BVG3","BVG4"),],
       aes(x=patch, y=Drive.Per.Task,fill=patch)) + 
  geom_bar(stat="identity") + 
  geom_text(aes(label = max(Drive.Per.Task, na.rm = TRUE)))+
  facet_wrap(~ CSS_WEEK_END_DATE)   # one panel per week; the stray closing parenthesis is removed
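To illustrate the guess that the bars are sums, here is a small sketch on made-up data (the data frame toy and its values are hypothetical stand-ins for drivePerTaskMelted). With stat="identity", one rectangle is drawn per row and rows sharing an x value are stacked, so the bar height is the per-patch total. As far as I can tell, max(Drive.Per.Task) inside aes() is also evaluated over the whole data rather than per patch, which would explain a single label value drawn at the low, unsummed y positions. Reducing the data to one row per patch before plotting avoids both effects:

```r
library(ggplot2)

# Hypothetical toy data: three weekly rows per patch
toy <- data.frame(
  patch = rep(c("BVG1", "BVG2"), each = 3),
  Drive.Per.Task = c(28, 30, 35, 26, 27, 29)
)

# What the stacked bars effectively show: the total per patch
sums <- aggregate(Drive.Per.Task ~ patch, data = toy, FUN = sum)

# One row per patch holding the maximum keeps the bars on the original scale
maxes <- aggregate(Drive.Per.Task ~ patch, data = toy, FUN = max)

# One bar per patch, labelled with its own maximum just above the bar
p <- ggplot(maxes, aes(x = patch, y = Drive.Per.Task, fill = patch)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label = Drive.Per.Task), vjust = -0.5)
print(p)
```

Swapping FUN = max for FUN = mean would give mean bars instead, similar in spirit to the stat_summary approach in the other answer.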