I am running the following ggplot code:
mu <- ddply(dfcards, "cluster", summarise, grp.mean = mean(CardsBalance)) # Calculate means
ggplot(dfcards[dfcards$cluster == 1, ], aes(x = CardsBalance, color = qrtClusIncome,
                                            fill = qrtClusIncome)) +
  geom_histogram(aes(y = ..density..), position = "identity", alpha = 0.9, bins = 200) +
  geom_density(alpha = 0.6, size = 2) +
  geom_vline(data = mu, aes(xintercept = grp.mean, color = qrtClusIncome),
             linetype = "dashed", size = 1.5) +
  labs(title = "Distribution of Credit Cards Balance per Income quintile in Cluster 1",
       x = "Cards Balance", y = "Density")
I get this error:
Error in eval(expr, envir, enclos) : object 'qrtClusIncome' not found
As you can see, 'qrtClusIncome' is a factor column (third from the end) in the data frame that is passed to the ggplot function:
> str(dfcards[dfcards$cluster==1,])
'data.frame': 11200 obs. of 55 variables:
$ cluster : int 1 1 1 1 1 1 1 1 1 1 ...
$ Collateral : num 0 0 0 0 0 0 0 0 0 0 ...
$ TotalCredit : num 575.9 982.5 85 5970.4 47.6 ...
$ TotalCScore : num 693 677 673 723 699 680 680 678 699 692 ...
$ CarBalance : num 0 0 0 0 0 0 0 0 0 0 ...
$ CardsBalance : num 575.9 982.5 85 0 47.6 ...
$ ConsumerBalance : num 0 0 0 5970 0 ...
$ MortgageBalance : num 0 0 0 0 0 0 0 0 0 0 ...
$ Gender : Factor w/ 2 levels "0","1": 2 2 2 1 1 2 1 1 1 1 ...
$ Age : num 37 39 36 35 27 35 37 32 33 31 ...
$ Profession : chr "Missing" "Bank Employee" "Bank Employee" "Missing" ...
$ Lifetime : num 5 8 10 7 7 10 12 6 10 8 ...
$ Owner : Factor w/ 2 levels "0","1": 2 2 1 2 1 1 2 1 1 1 ...
$ Income : num 1e+06 1e+06 1e+06 1e+05 1e+05 ...
$ viotiko : num 8 0 0 8 6 8 8 8 0 7 ...
$ pd_1year : num 0.00843 0.00843 0.00843 0.00843 0.00843 ...
$ pd_1year_group : chr "<1%" "<1%" "<1%" "<1%" ...
$ viot_0 : num 0 1 1 0 0 0 0 0 1 0 ...
$ viot_1 : num 0 0 0 0 0 0 0 0 0 0 ...
$ viot_2 : num 0 0 0 0 0 0 0 0 0 0 ...
$ viot_3 : num 0 0 0 0 0 0 0 0 0 0 ...
$ viot_4 : num 0 0 0 0 0 0 0 0 0 0 ...
$ viot_5 : num 0 0 0 0 0 0 0 0 0 0 ...
$ viot_6 : num 0 0 0 0 1 0 0 0 0 0 ...
$ viot_7 : num 0 0 0 0 0 0 0 0 0 1 ...
$ viot_8 : num 1 0 0 1 0 1 1 1 0 0 ...
$ Bank_Employee : Factor w/ 2 levels "0","1": 1 2 2 1 1 1 1 1 1 1 ...
$ Businessman : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Doctor : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Engineer : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Farmer : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Housewife : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Independent : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Lawyer : Factor w/ 2 levels "0","1": 1 1 1 1 1 2 1 1 1 1 ...
$ Missing : Factor w/ 2 levels "0","1": 2 1 1 2 1 1 2 2 1 2 ...
$ Pensioner : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ PrSec_Employee : Factor w/ 2 levels "0","1": 1 1 1 1 2 1 1 1 1 1 ...
$ PubSec_Employee : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Self_Employed : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Student : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Tradesman : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ Unemployed : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 2 1 ...
$ OtherProf : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
$ qrtCollateral : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ qrtTotalCScore : Factor w/ 10 levels "1","2","3","4",..: 8 3 1 10 9 4 4 3 9 8 ...
$ qrtCarBalance : Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
$ qrtCardsBalance : Factor w/ 8 levels "1","2","3","4",..: 7 8 5 2 4 7 7 7 4 5 ...
$ qrtConsumerBalance: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
$ qrtMortgageBalance: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
$ qrtAge : Factor w/ 10 levels "1","2","3","4",..: 2 3 2 2 1 2 2 1 1 1 ...
$ qrtLifetime : Factor w/ 9 levels "1","2","3","4",..: 2 3 4 3 3 4 5 2 4 3 ...
$ qrtIncome : Factor w/ 2 levels "1","2": 2 2 2 2 2 2 2 2 1 1 ...
$ qrtClusIncome : Factor w/ 5 levels "1","2","3","4",..: 5 5 5 5 5 5 5 5 2 2 ...
$ MaxBalance : num 39508 39508 39508 39508 39508 ...
$ IncrCardsBal : num 38932 38526 39423 39508 39461 ..
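When I use another variable instead, namely 'as.factor(cluster)', the code runs without problems.
How do you explain this? What should I change in the code?
Any advice would be appreciated.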
Answer 0 (score: 0)
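What is in your data frame mu? The error is probably in this line:
geom_vline(data=mu, aes(xintercept=grp.mean, color= qrtClusIncome ),
linetype="dashed", size = 1.5)
From the code shown, mu does not contain qrtClusIncome, so the line above fails. mu is built by ddply grouped on "cluster", so it holds only the columns cluster and grp.mean. That would also explain why 'as.factor(cluster)' works: cluster is present in mu.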
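One possible fix, as a minimal sketch: compute the means per qrtClusIncome within cluster 1, so that the data frame passed to geom_vline actually contains the column used in its color mapping. The toy dfcards below is made-up data that only stands in for the real one, cl1 is a helper name introduced here for illustration, and base R's aggregate replaces ddply so the snippet needs only ggplot2. The histogram layer is left out to keep it short.

```r
library(ggplot2)

# Hypothetical toy data standing in for dfcards (only the columns the plot needs)
set.seed(1)
dfcards <- data.frame(
  cluster       = rep(1:2, each = 100),
  CardsBalance  = rexp(200, rate = 1 / 500),
  qrtClusIncome = factor(rep(1:5, times = 40))
)

# Keep only cluster 1, then compute one mean per income group
cl1 <- dfcards[dfcards$cluster == 1, ]
mu  <- aggregate(CardsBalance ~ qrtClusIncome, data = cl1, FUN = mean)
names(mu)[names(mu) == "CardsBalance"] <- "grp.mean"

# mu now carries qrtClusIncome, so the geom_vline mapping can resolve it
p <- ggplot(cl1, aes(x = CardsBalance, color = qrtClusIncome, fill = qrtClusIncome)) +
  geom_density(alpha = 0.6) +
  geom_vline(data = mu, aes(xintercept = grp.mean, color = qrtClusIncome),
             linetype = "dashed")
p
```

With plyr loaded, the same summary could presumably be written in the question's style as ddply(cl1, "qrtClusIncome", summarise, grp.mean = mean(CardsBalance)). Running names(mu) before plotting is a quick way to check which columns the layer can see.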