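I am using ggplot's geom_smooth to plot turnover data for a customer group, comparing the previous year with the current year (by calendar week). Because the last week is not yet complete, I would like to draw that week with a dashed line type. However, I cannot figure out how to do this. I can change the line type for the whole plot or for a whole series, but not within a series (depending on the value of x):
To keep things simple, let's just use the following example: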
library(ggplot2)
set.seed(42)
frame <- data.frame(series = rep(c('a','b'),50),x = 1:100, y = runif(100))
ggplot(frame,aes(x = x,y = y, group = series, color=series)) +
geom_smooth(size=1.5, se=FALSE)
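How can I change this to get a dashed line for x >= 75?
Any help is much appreciated!
Edit, 2016-03-05
Of course, when I tried to apply this approach to my original plot, I failed. The problem is the ribbon, which is computed with stat_summary and a predefined function. I tried to use stat_summary on the original data (mdf) and geom_line on smooth_data. Even if I comment out everything else, I still get the error "Continuous value supplied to discrete scale". I think the problem is that the original x values (Kalenderwoche) are discrete, while the new, smoothed x values are continuous. Do I have to convert one into the other somehow? What else can I do?
Here is what I tried (condensed to the essential lines):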
quartiles <- function(x) {
x <- na.omit(x) # remove NA values
median <- median(x)
q1 <- quantile(x,0.25)
q3 <- quantile(x,0.75)
data.frame(y = median, ymin = median, ymax = q3)
}
g <- ggplot(mdf, aes(x=Kalenderwoche, y=value, group=variable, colour=variable,fill=variable))+
geom_smooth(size=1.5, method="auto", se=FALSE)
# Take out the data for smooth line
smooth_data <- ggplot_build(g)$data[[1]]
ggplot(mdf, aes(x=Kalenderwoche, y=value, group=variable, colour=variable,fill=variable))+
stat_summary(fun.data = quartiles,geom="ribbon", colour="NA", alpha=0.25)+
geom_line(data=smooth_data, aes(x=x, y=y, group=group, colour=group, fill=group))
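As a quick sanity check of the quartiles helper used for the ribbon, here is a minimal sketch with made-up numbers (the input vector is purely illustrative and not taken from mdf). The function is copied unchanged from the code above.

```r
# Copy of the helper from the question, applied to a toy vector
quartiles <- function(x) {
  x <- na.omit(x)            # drop NA values
  median <- median(x)
  q1 <- quantile(x, 0.25)    # computed, but not used in the result
  q3 <- quantile(x, 0.75)
  data.frame(y = median, ymin = median, ymax = q3)
}

res <- quartiles(c(1, 2, 3, 4, NA))
res$y     # 2.5
res$ymin  # 2.5
res$ymax  # 3.25
```

So the ribbon spans from the median up to the 75th percentile; q1 is calculated but does not enter the returned data frame.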
mdf looks like this:
str(mdf)
'data.frame': 280086 obs. of 5 variables:
$ konto_id : int 1 1 1 1 1 1 1 1 1 1 ...
$ Kalenderwoche: Factor w/ 14 levels "2015-48","2015-49",..: 4 12 1 3 7 13 10 6 5 9 ...
$ variable : Factor w/ 2 levels "Umsatz","Umsatz Vorjahr": 1 1 1 1 1 1 1 1 1 1 ...
$ value : num 0 428.3 97.8 76 793.1 ...
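There are many accounts (konto_id), and for each account and calendar week (Kalenderwoche) there is the current turnover (Umsatz) and last year's turnover (Umsatz Vorjahr). If needed, I can provide a smaller version of the data.frame and the whole code.
Any help is much appreciated!
P.S. I am new to R, so my code may look silly to professionals, sorry :(
Edit, 2016-03-06
I have uploaded a subset of the data (mdf): mdf
The complete code for the original chart is below (it looks a bit odd with so little data, but that is not the point ;)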
library(dtw)
library(reshape2)
library(ggplot2)
library(RODBC)
library(Cairo)
# custom breaks for X axis
breaks.custom <- unique(mdf$Kalenderwoche)[c(TRUE,rep(FALSE,0))]
# function called by stat_summary
quartiles <- function(x) {
x <- na.omit(x)
median <- median(x)
q1 <- quantile(x,0.25)
q3 <- quantile(x,0.75)
data.frame(y = median, ymin = median, ymax = q3)
}
# Positions for guidelines and labels
horizontal.center <- (length(unique(mdf$Kalenderwoche))+1)/2
kw.horizontal.center <- as.vector(sort(unique(mdf$Kalenderwoche))[c(horizontal.center-0.5,horizontal.center+0.5)])
vpos.P75.label <- max(quantile(mdf$value[mdf$Kalenderwoche==kw.horizontal.center[1]],0.75)
,quantile(mdf$value[mdf$Kalenderwoche==kw.horizontal.center[2]],0.75))+10
# use the higher P75 value of the two weeks around the center
vpos.mean.label <- min(mean(mdf$value[mdf$Kalenderwoche==kw.horizontal.center[1]])
,mean(mdf$value[mdf$Kalenderwoche==kw.horizontal.center[2]]))-10
vpos.median.label <- min(median(mdf$value[mdf$Kalenderwoche==kw.horizontal.center[1]])
,median(mdf$value[mdf$Kalenderwoche==kw.horizontal.center[2]]))-10
hpos.vline <- which(as.vector(sort(unique(mdf$Kalenderwoche))=="2016-03"))
# custom colour palette (2 colors)
cbPaletteLine <- c("#DA2626", "#2626DA")
cbPaletteFill <- c("#F0A8A8", "#7C7CE9")
# ggplot
ggplot(mdf, aes(x=Kalenderwoche, y=value, group=variable, colour=variable,fill=variable))+
geom_smooth(size=1.5, method="auto", se=FALSE)+
# se=FALSE suppresses the standard-error band of the fit; the spread of the data is drawn instead:
stat_summary(fun.data = quartiles,geom="ribbon", colour="NA", alpha=0.25)+
scale_x_discrete(breaks=breaks.custom)+
scale_colour_manual(values=cbPaletteLine)+
scale_fill_manual(values=cbPaletteFill)+
#coord_cartesian(ylim = c(0, 250)) +
theme(legend.title = element_blank(), title = element_text(face="bold", size=12))+
#scale_color_brewer(palette="Dark2")+
labs(title = "Tranche 1", x = "Kalenderwoche", y = "Konto-Umsatz [CHF]")+
geom_vline(xintercept = hpos.vline, linetype=2)+
annotate("text", x=horizontal.center, y=vpos.median.label, label = "Median", size=4)+
annotate("text", x=horizontal.center, y=vpos.mean.label, label= "Mean", size=4)+
annotate("text", x=horizontal.center, y=vpos.P75.label, label = "P75%", size=4)+
theme(axis.text.x=element_text(angle = 90, hjust = 0.5, vjust = 0.5))
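Answer 0 (score: 4)
I am not quite sure how to smooth over all of the data and still use a different line type for a subset with the geom_smooth function alone. My idea is to extract the data that ggplot uses to build the plot and reproduce it with geom_line. This is how I would do it: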
set.seed(42)
frame <- data.frame(series=rep(c('a','b'), 50),
x = 1:100, y = runif(100))
library(ggplot2)
g <- ggplot(frame, aes(x=x, y=y, color=series)) + geom_smooth(se=FALSE)
# Take out the data for smooth line
smooth_data <- ggplot_build(g)$data[[1]]
ggplot(smooth_data[smooth_data$x <= 76, ], aes(x=x, y=y, color=as.factor(group), group=group)) +
geom_line(size=1.5) +
geom_line(data=smooth_data[smooth_data$x >= 74, ], linetype="dashed", size=1.5) +
scale_color_discrete("Series", breaks=c("1", "2"), labels=c("a", "b"))
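An alternative to pulling the curve out of ggplot_build is to compute the smooth yourself and split it at the cutoff. The following is only a sketch under some assumptions: it uses stats::loess with its default settings (which should resemble, but is not guaranteed to match, what geom_smooth draws here), and the names cutoff, smoothed, pieces and phase are made up for illustration.

```r
library(ggplot2)

set.seed(42)
frame <- data.frame(series = rep(c("a", "b"), 50), x = 1:100, y = runif(100))

cutoff <- 75  # assumed switch point from solid to dashed

# Fit one loess curve per series and evaluate it on a grid that contains
# the cutoff, so that both pieces of the line can share that point.
smoothed <- do.call(rbind, lapply(split(frame, frame$series), function(d) {
  fit  <- loess(y ~ x, data = d)
  grid <- sort(unique(c(seq(min(d$x), max(d$x), length.out = 80), cutoff)))
  data.frame(series = d$series[1], x = grid,
             y = predict(fit, newdata = data.frame(x = grid)))
}))
row.names(smoothed) <- NULL

# The cutoff row goes into both pieces so the line has no gap.
pieces <- rbind(
  transform(smoothed[smoothed$x <= cutoff, ], phase = "complete"),
  transform(smoothed[smoothed$x >= cutoff, ], phase = "incomplete")
)

# A dashed line needs a constant linetype within each group,
# hence the grouping by series and phase together.
p <- ggplot(pieces, aes(x = x, y = y, colour = series, linetype = phase,
                        group = interaction(series, phase))) +
  geom_line(linewidth = 1.5) +  # use size = 1.5 on ggplot2 older than 3.4.0
  scale_linetype_manual(values = c(complete = "solid", incomplete = "dashed"))
p
```

Mapping linetype to a column also gives a legend entry for the incomplete period, which the two-layer version does not.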
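You are right. The problem is that you are adding a continuous x to the discrete x of the original layer. One way to handle it is to create a lookup table; in this case that is easy, because x is a sequence from 1 to 14, so the discrete x can be recovered by index. In your code, it should work if you add:
level <- levels(mdf$Kalenderwoche)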
ggplot(mdf, aes(x=Kalenderwoche, y=value, group=variable, colour=variable,fill=variable))+
stat_summary(fun.data = quartiles,geom="ribbon", colour="NA", alpha=0.25) +
geom_line(data=smooth_data, aes(x=level[x], y=y, group=group, colour=as.factor(group), fill=NA))
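The lookup idea can be checked in isolation. This is a minimal sketch with a made-up factor (wk is hypothetical and only stands in for mdf$Kalenderwoche): the levels act as the table, and the integer codes index into it.

```r
# Hypothetical calendar weeks, stored as a factor like mdf$Kalenderwoche
wk <- factor(c("2015-50", "2015-48", "2015-49", "2015-48"))

level <- levels(wk)      # lookup table: position -> label
codes <- as.integer(wk)  # integer positions of each value within the levels

level         # "2015-48" "2015-49" "2015-50"
codes         # 3 1 2 1
level[codes]  # recovers the original labels
```

This only works as long as the x values in smooth_data are whole numbers within 1 to length(level), which is worth confirming on the real data before relying on it.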
Here is my attempt at this problem:
g <- ggplot(mdf, aes(x=Kalenderwoche, y=value, group=variable, colour=variable,fill=variable)) +
geom_smooth(size=1.5, method="auto", se=FALSE) +
# se=FALSE suppresses the standard-error band of the fit; the spread of the data is drawn instead:
stat_summary(fun.data = quartiles,geom="ribbon", colour="NA", alpha=0.25)
smooth_data <- ggplot_build(g)$data[[1]]
ribbon_data <- ggplot_build(g)$data[[2]]
# Use them as lookup table
level <- levels(mdf$Kalenderwoche)
clevel <- levels(mdf$variable)
ggplot(smooth_data[smooth_data$x <= 13, ], aes(x=level[x], y=y, group=group, color=as.factor(clevel[group]))) +
geom_line(size=1.5) +
geom_line(data=smooth_data[smooth_data$x >= 13, ], linetype="dashed", size=1.5) +
geom_ribbon(data=ribbon_data,
aes(x=x, ymin=ymin, ymax=ymax, fill=as.factor(clevel[group]), color=NA), alpha=0.25) +
scale_x_discrete(breaks=breaks.custom) +
scale_colour_manual(values=cbPaletteLine) +
scale_fill_manual(values=cbPaletteFill) +
#coord_cartesian(ylim = c(0, 250)) +
theme(legend.title = element_blank(), title = element_text(face="bold", size=12))+
#scale_color_brewer(palette="Dark2")+
labs(title = "Tranche 1", x = "Kalenderwoche", y = "Konto-Umsatz [CHF]")+
geom_vline(xintercept = hpos.vline, linetype=2)+
annotate("text", x=horizontal.center, y=vpos.median.label, label = "Median", size=4)+
annotate("text", x=horizontal.center, y=vpos.mean.label, label= "Mean", size=4)+
annotate("text", x=horizontal.center, y=vpos.P75.label, label = "P75%", size=4)+
theme(axis.text.x=element_text(angle = 90, hjust = 0.5, vjust = 0.5))
Note that the legend has a border.