Question

I'm using R to generate some charts of certain metrics, and I get nice results like this for data with more than 3 data points:

However, I've noticed that for data with only a few points, my results are very poor.

If I plot data with only two points, I get a blank plot.

foo_two_points.dat

cluster,account,current_database,action,operation,count,day
cluster19,col0063,col0063,foo_two,two_bar,10,2016-10-04 00:00:00-07:00
cluster61,dwm4944,dwm4944,foo_two,two_bar,2,2016-12-14 00:00:00-08:00

If I plot a single data point, it works.

foo_one_point.dat

cluster,account,current_database,action,operation,count,day
cluster1,foo0424,foo0424,fooone,,2,2016-11-01 00:00:00-07:00

With three points, it almost works, but the result isn't accurate.

foo_three_points.dat

cluster,account,current_database,action,operation,count,day
cluster23,col2225,col2225,foo_three,bar,9,2016-12-22 00:00:00-08:00
cluster23,col2225,col2225,foo_three,bar,1,2016-12-29 00:00:00-08:00
cluster12,red1782,red1782,foo_three,bar,2,2016-10-25 00:00:00-07:00

With all of the data, the plot looks fine.

But with two or three points, it does not.

Here is my plot.r file:

library(ggplot2)
library(scales)

args<-commandArgs(TRUE)

filename<-args[1]
n = nchar(filename) - 4
thetitle = substring(filename, 1, n)
print(thetitle)
png_filename <- stringi::stri_flatten(stringi::stri_join(c(thetitle,'.png')))

wide<-as.numeric(args[2])
high<-as.numeric(args[3])
legend_left<-as.numeric(args[4])

pos <- if(legend_left == 1) c(1,0)  else c(0,1) 
place <- if(legend_left == 1) 'left'  else 'right'

print(wide)
print(high)

print(filename)
print(png_filename)

dat = read.csv(filename)

dat$account = as.character(dat$account)
dat$action=as.character(dat$action)
dat$operation = as.character(dat$operation)
dat$count = as.integer(dat$count)
dat$day = as.Date(dat$day)
dat[is.na(dat)]<-"N/A"

png(png_filename,width=wide,height=high)

p <- ggplot(dat, aes(x=day, y=count, fill=account, labels=TRUE)) 
p <- p + geom_histogram(stat="identity") 
p <- p + scale_x_date(labels=date_format("%b-%Y"), limits=as.Date(c('2016-10-01','2017-01-01')))
p <- p + theme(legend.position="bottom")
p <- p + guides(fill=guide_legend(nrow=5, byrow=TRUE))
p <- p + theme(text = element_text(size=15)) 
p<-p+labs(title=thetitle)

print(p)

dev.off()

Here is the command I use to run it:

RScript plot.r foo_five_points.dat 1600 800 0

What am I doing wrong?

Answer 1

I don't know whether this is a bug. I think it is actually working as designed, and the bars are being clipped when they spill over the limits.
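To see why the bars might spill over the limits, here is a rough check in base R for the two-point file. It assumes that the default bar width is 90% of the smallest gap between x values, which is my understanding of ggplot2's default and not something I have verified against your version. The variable names are only for illustration.

```r
# Dates from foo_two_points.dat and the limits passed to scale_x_date().
days   <- as.Date(c("2016-10-04", "2016-12-14"))
limits <- as.Date(c("2016-10-01", "2017-01-01"))

# Assumption: default bar width = 0.9 * smallest gap between distinct x values.
gap   <- min(diff(sort(unique(as.numeric(days)))))
width <- 0.9 * gap

# Left and right edge of each bar, in days since 1970-01-01.
left  <- as.numeric(days) - width / 2
right <- as.numeric(days) + width / 2

# TRUE where a bar lies entirely inside the limits.
fits <- left >= as.numeric(limits[1]) & right <= as.numeric(limits[2])
print(data.frame(day = days, width = width, fits = fits))
```

Under that assumption the two points are 71 days apart, so each bar would be about 64 days wide and neither one fits between 2016-10-01 and 2017-01-01, which would match the blank plot.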

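I also think this is more of a geom_bar than a geom_histogram, since this doesn't look like distribution data, but that is beside the point here, as both behave the same way.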

One solution is to set the width parameter of geom_histogram explicitly, rather than letting it be computed:

p <- ggplot(dat, aes(x=day, y=count, fill=account, labels=TRUE)) 
p <- p + geom_histogram(stat="identity",width=1) 
p <- p + scale_x_date(labels=date_format("%b-%Y"), limits=as.Date(c('2016-10-1','2017-01-01')))
p <- p + theme(legend.position="bottom")
p <- p + guides(fill=guide_legend(nrow=5, byrow=TRUE))
p <- p + theme(text = element_text(size=15)) 
p<-p+labs(title=thetitle)

Then the two-point example that was blank above gives you this, which seems correct:

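I'm not sure whether setting the width explicitly will still work when you have a lot of data and the bars need to be narrower. I suppose you could set it conditionally.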
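One way to set the width conditionally could look like the sketch below. The helper safe_bar_width is hypothetical (it is not part of ggplot2), and it again assumes the default width is 90% of the smallest gap between dates. It keeps that default when the bars fit and otherwise shrinks the width so the outermost bars stay inside the limits.

```r
# Hypothetical helper: pick a bar width (in days) that keeps every bar
# inside the x-axis limits.
safe_bar_width <- function(days, limits) {
  x   <- sort(unique(as.numeric(days)))
  lim <- as.numeric(limits)
  # Assumed default: 0.9 * smallest gap, or 0.9 when there is a single date.
  default <- if (length(x) > 1) 0.9 * min(diff(x)) else 0.9
  # Widest bar for which the first and last bars still fit inside the limits.
  room <- 2 * min(x[1] - lim[1], lim[2] - x[length(x)])
  min(default, room)
}

limits <- as.Date(c("2016-10-01", "2017-01-01"))
two    <- as.Date(c("2016-10-04", "2016-12-14"))
three  <- as.Date(c("2016-12-22", "2016-12-29", "2016-10-25"))

w_two   <- safe_bar_width(two, limits)
w_three <- safe_bar_width(three, limits)
print(c(two = w_two, three = w_three))

# The result would then be passed on in plot.r, for example:
#   p <- p + geom_histogram(stat="identity", width=safe_bar_width(dat$day, limits))
```

Note that a date lying exactly on a limit would give a width of 0 here, so you may want to add a minimum width or widen the limits in that case.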

ggplot2 fails to plot a correct graph with only two or three data points
