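I have a dataset with results from 100 simulated train runs on a network with 4 trains and 6 stations, recording each train's lateness on arrival at each station. My data looks like this: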
MyData <- data.frame(
  Simulation = rep(sort(rep(1:100, 6)), 4),
  Train_number = sort(rep(c(100, 102, 104, 106), 100*6)),
  Stations = rep(c("ST_1", "ST_2", "ST_3", "ST_4", "ST_5", "ST_6"), 100*4),
  # 400 values; data.frame() recycles them to fill all 2400 rows
  Arrival_Lateness = c(rep(0, 60), rexp(40, 1), rep(0, 60), rexp(40, 2), rep(0, 60), rexp(40, 3), rep(0, 60), rexp(40, 5))
)
I now create boxplots for each train and station using custom quantiles (thanks to jlhoward):
library(ggplot2)

# Summary function: 5%, 25%, 50%, 75% and 95% quantiles,
# named the way the boxplot geom expects them
f <- function(x) {
  r <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

ggplot(MyData, aes(factor(Stations), Arrival_Lateness, fill = factor(Train_number))) +
  stat_summary(fun.data = f, geom="boxplot", position="dodge")
The result looks pretty nice.
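As a quick sanity check of what f hands to the boxplot geom, it can be called on a small vector. The input 0:100 below is an illustrative assumption, not part of the simulation data; it is chosen only because its quantiles are easy to read off.

```r
# Same summary function as above, repeated so this snippet runs on its own.
f <- function(x) {
  r <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# Illustrative input (an assumption, not the simulation data).
res <- f(0:100)
print(res)
# ymin = 5, lower = 25, middle = 50, upper = 75, ymax = 95
```

The whiskers therefore end at the 5% and 95% quantiles rather than at the default 1.5 * IQR limits.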
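What I am missing now are the outliers. I want to plot the top 5% of observations for each train/station combination on top of each boxplot. This is what I tried (inspired by this question):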
# Keep only the values strictly above the 95% quantile
q <- function(x) {
  subset(x, quantile(x, 0.95) < x)
}

ggplot(MyData, aes(factor(Stations), Arrival_Lateness, fill = factor(Train_number))) +
  stat_summary(fun.data = f, geom="boxplot", position="dodge") +
  stat_summary(fun.y = q, geom="point", position="dodge")
I get the message "ymax not defined: adjusting position using y instead", and the resulting plot is obviously not what I want.
Answer 0 (score: 6)
Like this?
# Note: fun.y is the argument name from older ggplot2 releases;
# since ggplot2 3.3.0 it is deprecated in favor of fun.
ggplot(MyData, aes(factor(Stations), Arrival_Lateness,
                   fill = factor(Train_number))) +
  stat_summary(fun.data = f, geom="boxplot",
               position=position_dodge(1)) +
  stat_summary(aes(color=factor(Train_number)), fun.y = q, geom="point",
               position=position_dodge(1))
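Compared with the attempt in the question, both layers here share the same explicit position_dodge(1), and the point layer also maps color to the train. To see what q passes to the point layer, it can be run on a single vector. The sample below is an illustrative stand-in for one train/station group (100 exponential draws with an arbitrary seed), not the actual data.

```r
# Same helper as in the question, repeated so this snippet runs on its own.
q <- function(x) {
  subset(x, quantile(x, 0.95) < x)
}

set.seed(1)          # arbitrary seed, only to make the illustration repeatable
x <- rexp(100, 1)    # hypothetical stand-in for one train/station group

out <- q(x)
length(out)                    # 5: five of 100 distinct values lie above the 95% quantile
all(out > quantile(x, 0.95))   # TRUE
```

Because q returns several values per group rather than a single summary, each returned value is drawn as its own point.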
IMHO this version is a bit easier to read:
ggplot(MyData, aes(factor(Train_number), Arrival_Lateness,
                   fill = factor(Train_number))) +
  stat_summary(fun.data = f, geom="boxplot",
               position=position_dodge(1)) +
  stat_summary(aes(color=factor(Train_number)), fun.y = q, geom="point",
               position=position_dodge(1)) +
  facet_grid(.~Stations, scales="free") +
  theme(axis.text.x=element_text(angle=-90, hjust=1, vjust=0.2)) +
  labs(x="Train Number")
Edit (in response to OP's comment)
ggplot(MyData, aes(factor(Train_number), Arrival_Lateness,
                   fill = factor(Train_number))) +
  stat_summary(fun.data = f, geom="boxplot",
               position=position_dodge(1)) +
  stat_summary(aes(color=factor(Train_number)), fun.y = q, geom="point",
               position=position_dodge(1)) +
  facet_grid(.~Stations, scales="free") +
  theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()) +
  scale_fill_discrete("Train") + scale_color_discrete("Train") +
  labs(x="")
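To turn off the x-axis text and tick marks, use theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()). To turn off the axis label, use labs(x=""). Also, the fill and color scales must have the same name, otherwise they are shown as two separate legends.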
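As an alternative to computing the outliers inside stat_summary, they can be extracted into their own data frame first and drawn with geom_point. This is only a sketch of that different technique, not what the plots above do: the names cutoff and Outliers and the seed are assumptions introduced here, and the plotting step is skipped if ggplot2 is not installed.

```r
# Rebuild the example data from the question (the seed is an arbitrary addition).
set.seed(42)
MyData <- data.frame(
  Simulation = rep(sort(rep(1:100, 6)), 4),
  Train_number = sort(rep(c(100, 102, 104, 106), 100*6)),
  Stations = rep(c("ST_1", "ST_2", "ST_3", "ST_4", "ST_5", "ST_6"), 100*4),
  Arrival_Lateness = c(rep(0, 60), rexp(40, 1), rep(0, 60), rexp(40, 2),
                       rep(0, 60), rexp(40, 3), rep(0, 60), rexp(40, 5))
)

# Custom boxplot summary from the question.
f <- function(x) {
  r <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# 95% quantile of each train/station group, repeated for every row of that group.
cutoff <- ave(MyData$Arrival_Lateness, MyData$Train_number, MyData$Stations,
              FUN = function(x) quantile(x, 0.95))

# Rows strictly above their group's cutoff (hypothetical helper data frame).
Outliers <- MyData[MyData$Arrival_Lateness > cutoff, ]
table(Outliers$Train_number, Outliers$Stations)  # 5 points per group for this data

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(MyData, aes(factor(Stations), Arrival_Lateness,
                          fill = factor(Train_number))) +
    stat_summary(fun.data = f, geom = "boxplot", position = position_dodge(1)) +
    geom_point(data = Outliers, aes(color = factor(Train_number)),
               position = position_dodge(1))
  print(p)
}
```

Keeping the outliers in a separate data frame makes it easy to inspect or count them before plotting, at the cost of one extra step.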