How to correctly annotate a stacked bar plot with the actual number of observations in csv files?

Asked: 2016-11-27 21:15:25

Tags: r csv ggplot2

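I have implemented a function that accepts a list of data.frames as input and filters each one by a threshold. I can now export the filtered results as csv files. To better understand how many observations end up in each output file, an annotated stacked bar plot seems like a good choice. How can I get an annotated bar plot for a list of csv files? How should I process the csv files to build the stacked bar plot? Any ideas are welcome. Thanks a lot.

Reproducible data: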

output <- list(
  bar = data.frame(begin=seq(2, by=14, len=45), end=seq(9, by=14, len=45), score=sample(60,45)),
  cat = data.frame(begin=seq(5, by=21, len=36), end=seq(13, by=21, len=36), score=sample(75,36)),
  foo = data.frame(begin=seq(8, by=18, len=52), end=seq(15, by=18, len=52), score=sample(100,52))
)

I implemented this function to filter the input list by a threshold:

myFunc <- function(mList, threshold) {
  # check input param
  stopifnot(is.numeric(threshold))
  res <- lapply(mList, function(elm) {
    split(elm, ifelse(elm$score >= threshold, "saved", "droped"))
  })
  rslt <- lapply(names(res), function(elm) {
    mapply(write.csv,
           res[[elm]],
           paste0(elm, ".", names(res[[elm]]), ".csv"))
  })
  return(rslt)
}

#' @example 
myFunc(output, 10)

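Now that I have a list of csv files, I would like to draw a stacked bar plot with one bar per file group, annotated with the actual number of observations. How can I achieve this efficiently?

A mockup of the desired plot was attached as an image.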

1 Answer:

Answer 0: (score: 3)

Original answer (before the edit/comments):

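# List the exported files; list.files() returns them in alphabetical order
# ("droped" matches the spelling used in the file names written by myFunc)
d <- list.files(pattern = "\\.droped\\.csv$")
s <- list.files(pattern = "\\.saved\\.csv$")

# Count the rows (observations) in each file
dropped <- vapply(d, function(f) nrow(read.csv(f)), integer(1), USE.NAMES = FALSE)
saved   <- vapply(s, function(f) nrow(read.csv(f)), integer(1), USE.NAMES = FALSE)
tmp1 <- cbind(dropped, saved)

# Stacked Bar Plot with Colors and Legend
# The legend order follows the alphabetical file order: bar, cat, foo
barplot(tmp1, main = "CSV File Row Counts",
        ylab = "Number of Obs.", col = c("darkblue", "red", "green"),
        legend.text = c("bar", "cat", "foo"))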
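
If the original list is still in memory, the same counts could also be computed without reading the csv files back in. The following is only a minimal sketch under that assumption; `count_by_threshold` is a hypothetical helper name, not something from the question, and it reuses the question's "saved"/"droped" labels.

```r
# Hypothetical helper (not part of the original question or answer):
# for each data frame in the list, count the rows on each side of the threshold.
count_by_threshold <- function(mList, threshold) {
  stopifnot(is.list(mList), is.numeric(threshold), length(threshold) == 1)
  do.call(rbind, lapply(names(mList), function(nm) {
    keep <- mList[[nm]]$score >= threshold
    data.frame(name   = nm,
               filter = c("droped", "saved"),
               obs    = c(sum(!keep), sum(keep)),
               stringsAsFactors = FALSE)
  }))
}

# Same structure as the question's reproducible data
set.seed(1)
output <- list(
  bar = data.frame(begin = seq(2, by = 14, len = 45), end = seq(9, by = 14, len = 45), score = sample(60, 45)),
  cat = data.frame(begin = seq(5, by = 21, len = 36), end = seq(13, by = 21, len = 36), score = sample(75, 36)),
  foo = data.frame(begin = seq(8, by = 18, len = 52), end = seq(15, by = 18, len = 52), score = sample(100, 52))
)
counts <- count_by_threshold(output, 10)
```

This yields one row per combination of list element and filter status, so a zero count still appears explicitly instead of showing up as a missing file.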


Revised answer (after the edit):

Following the comments/edit, I modified the plot to include labels inside the segments:

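library(ggplot2)
# obs holds the row counts computed above: dropped files first, then saved files
Data      <- data.frame(obs    = c(dropped, saved),
                        # could get name from "output" to make it programmatic;
                        # the order must match the alphabetical order of the files
                        name   = rep(c("bar", "cat", "foo"), times = 2),
                        filter = c(rep("Dropped", length(dropped)),
                                      rep("Saved", length(saved)))
)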

ggplot(Data, aes(x = filter, y = obs, fill = name, label = obs)) +
  geom_bar(stat = "identity") +
  geom_text(size = 3, position = position_stack(vjust = 0.5))
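
To avoid hard-coding the names, the `name` and `filter` columns could be parsed from the file names themselves. This is a rough sketch, assuming every file follows the `<name>.<saved|droped>.csv` pattern written by `myFunc` and that `<name>` contains no dots; `count_csv_rows` is a hypothetical helper name, and the demo files are made up for illustration.

```r
# Hypothetical helper: read every "<name>.<saved|droped>.csv" file in `path`
# and return one row per file with its observation count.
count_csv_rows <- function(path = ".") {
  files <- list.files(path, pattern = "\\.(saved|droped)\\.csv$")
  parts <- strsplit(files, ".", fixed = TRUE)
  data.frame(
    name   = vapply(parts, `[`, character(1), 1),
    filter = vapply(parts, `[`, character(1), 2),
    obs    = vapply(file.path(path, files),
                    function(f) nrow(read.csv(f)), integer(1),
                    USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Demo in a temporary directory so the working directory is left untouched
tmp_dir <- tempfile("csv_demo")
dir.create(tmp_dir)
write.csv(data.frame(score = 1:3), file.path(tmp_dir, "bar.droped.csv"), row.names = FALSE)
write.csv(data.frame(score = 4:8), file.path(tmp_dir, "bar.saved.csv"), row.names = FALSE)
Data <- count_csv_rows(tmp_dir)
```

A data frame shaped like this has the `obs`, `name` and `filter` columns that the `ggplot()` call maps to `y`/`label`, `fill` and `x`.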
