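I am doing research on non-defaulters and defaulters in banking. In that context I am plotting their distributions against a score in a bar plot. The higher the score, the better the credit rating.
Since the number of defaults is very small compared to the number of non-defaults, plotting both on the same bar plot is not very useful, because the defaults are hardly visible. I therefore make a second bar plot based on the defaulters' scores only, but on the same interval scale as the full bar plot of both defaulters and non-defaulters. I would then like to add vertical lines to the first bar plot indicating where the highest and the lowest defaulter scores are located. This is to see where the distribution of the defaulters fits into the overall distribution of defaulters and non-defaulters.
Below is the code I am using, with the real data replaced by (seeded) random data.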
library(ggplot2)
library(grid) #needed for grid.newpage(), pushViewport() and viewport() below
#NDS represents non-defaults and DS defaults on the same scale
#although here being just some random normals for the sake of simplicity.
set.seed(10)
NDS<-rnorm(10000,sd=1)-2
DS<-rnorm(100,sd=2)-5
#Cutoffs are constructed such that intervals of size 0.3
#contain all values of NDS & DS
minCutoff<--9.3
maxCutoff<-2.1
#Generate the actual interval "bins"
NDS_CUT<-cut(NDS,breaks=seq(minCutoff, maxCutoff, by = 0.3))
DS_CUT<-cut(DS,breaks=seq(minCutoff, maxCutoff, by = 0.3))
#Manually generate where to put the vertical lines for min(DS) and max(DS)
minDS_bar<-levels(cut(NDS,breaks=seq(minCutoff, maxCutoff, by = 0.3)))[1]
maxDS_bar<-levels(cut(NDS,breaks=seq(minCutoff, maxCutoff, by = 0.3)))[32]
#Generate data frame - seems stupid, but makes sense
#when the "real" data is used :-)
NDSdataframe<-cbind(as.data.frame(NDS_CUT),rep(factor("State-1"),length(NDS_CUT)))
colnames(NDSdataframe)<-c("Score","Action")
DSdataframe<-cbind(as.data.frame(DS_CUT),rep(factor("State-2"),length(DS_CUT)))
colnames(DSdataframe)<-c("Score","Action")
fulldataframe<-rbind(NDSdataframe,DSdataframe)
attach(fulldataframe)
#Plot the full distribution of NDS & DS
# with geom_vline(xintercept = minDS_bar) + geom_vline(xintercept = maxDS_bar)
# that unfortunately does not show :-(
fullplot<-ggplot(fulldataframe, aes(Score, fill=factor(Action,levels=c("State-2","State-1")))) + geom_bar(position="stack") + opts(axis.text.x = theme_text(angle = 45)) + opts (legend.position = "none") + xlab("Scoreinterval") + ylab("Antal pr. interval") + geom_vline(xintercept = minDS_bar) + geom_vline(xintercept = maxDS_bar)
#Generate dataframe for DS only
#It might seem stupid, but again makes sense
#when using the original data :-)
DSdataframe2<-cbind(as.data.frame(DS_CUT),rep(factor("State-2"),length(DS_CUT)))
colnames(DSdataframe2)<-c("theScore","theAction")
#Calculate max number of observations to adjust bar plot of DS only
myMax<-max(table(DSdataframe2))+1
attach(DSdataframe2)
#Generate bar plot of DS only
subplot<-ggplot(DSdataframe2, aes(theScore, fill=factor(theAction))) + geom_bar(position="stack") + opts(axis.text.x = theme_text(angle = 45)) + opts(legend.position = "none") + ylim(0, myMax) + xlab("Scoreinterval") + ylab("Antal pr. interval")
#plot on a grid
grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 1)))
vplayout <- function(x, y) {
  viewport(layout.pos.row = x, layout.pos.col = y)
}
print(fullplot, vp = vplayout(1, 1))
print(subplot, vp = vplayout(2, 1))
#detach dataframes
detach(DSdataframe2)
detach(fulldataframe)
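Also, if anybody knows how to align the plots so that the matching intervals sit directly below/above each other in the grid plot, I would appreciate it.
I hope somebody is able to help!
Thanks in advance,
Christian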
Answer 0 (score: 3)
Wrap aes() around the xintercept in the geom_vline layers:
... + geom_vline(aes(xintercept = minDS_bar)) + geom_vline(aes(xintercept = maxDS_bar))
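As a side note on where the two xintercept values come from: the question hard-codes the level indices [1] and [32]. Here is a minimal sketch, assuming the same seeded data and breaks as in the question, that derives the two bin labels from range(DS) instead. The names `breaks` and `ds_bins` are introduced here only for illustration.

```r
# Same seeded data and cutoffs as in the question
set.seed(10)
NDS <- rnorm(10000, sd = 1) - 2
DS <- rnorm(100, sd = 2) - 5
breaks <- seq(-9.3, 2.1, by = 0.3)  # hypothetical helper name

NDS_CUT <- cut(NDS, breaks = breaks)

# Bin the lowest and highest defaulter score with the same breaks,
# so the labels match the levels used on the x axis.
# A score outside the breaks would give NA here.
ds_bins <- cut(range(DS), breaks = breaks)
minDS_bar <- as.character(ds_bins[1])
maxDS_bar <- as.character(ds_bins[2])
```

The resulting labels can then be passed to geom_vline() in the same way as above, and they keep working if the data or the breaks change.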
Answer 1 (score: 1)
Question 1:
Since you are supplying the vertical lines as data, you first have to map the aesthetics using aes():
fullplot <- ggplot(
fulldataframe,
aes(Score, fill=factor(Action,levels=c("State-2","State-1")))) +
geom_bar(position="stack") +
opts(axis.text.x = theme_text(angle = 45)) +
opts (legend.position = "none") +
xlab("Scoreinterval") +
ylab("Antal pr. interval") +
geom_vline(aes(xintercept = minDS_bar)) +
geom_vline(aes(xintercept = maxDS_bar))
Question 2:
To align the plots, you can use the align.plots() function from the ggExtra package:
install.packages("dichromat")
install.packages("ggExtra", repos="http://R-Forge.R-project.org")
library(ggExtra)
ggExtra::align.plots(fullplot, subplot)
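If align.plots() is not available in your installation, one possible alternative is to stack the two plots as gtables so that their panels share the same widths. This is only a sketch and not the method used above: it assumes a ggplot2 version that provides theme(), element_text() and ggplotGrob(), and the data frame `scores` and the plot names `p_full` and `p_sub` are made up for the example. Setting drop = FALSE keeps every score interval on both x axes, so the bins line up.

```r
library(ggplot2)
library(grid)

# Seeded stand-in data, binned with the cutoffs from the question
set.seed(10)
breaks <- seq(-9.3, 2.1, by = 0.3)
scores <- data.frame(
  Score = cut(c(rnorm(10000, sd = 1) - 2, rnorm(100, sd = 2) - 5),
              breaks = breaks),
  Action = rep(c("State-1", "State-2"), times = c(10000, 100))
)

# Shared layers: keep empty bins on the axis and hide the legend
shared <- list(
  geom_bar(),
  scale_x_discrete(drop = FALSE),
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none")
)

p_full <- ggplot(scores, aes(Score, fill = Action)) + shared
p_sub <- ggplot(subset(scores, Action == "State-2"),
                aes(Score, fill = Action)) + shared

# Stack the two plot tables; size = "first" reuses the column widths
# of the first plot, so the panels are aligned vertically
aligned <- rbind(ggplotGrob(p_full), ggplotGrob(p_sub), size = "first")
grid.newpage()
grid.draw(aligned)
```

Because both plots use the same factor levels and the same column widths, each interval in the lower plot sits directly under the same interval in the upper plot.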