Multiple colored lines in ggparcoord with facet_wrap

Asked: 2014-12-17 00:04:02

Tags: r ggplot2 ggally

My data frame has the following format:

Month1  Month2  Month3  Month4  Month5  Month6  Month7  Month8  Month9  Month10 Month11 Month12 Month13 Month14 Month15 Type    Subject
2.5617749   2.3900798   2.4261968   3.2463769   2.8622897   2.9429682   3.3301257   2.5712439   2.1379820   2.1297074   1.8171952   1.3065964   0.6729969   0.2342636   0.2643012   Filing 1    Tools of the Trade
2.6787155   3.3005452   3.2765383   3.2594204   3.1994482   2.9489934   3.0170951   2.9648050   2.5933965   2.7525476   2.6949229   2.7816262   2.6125091   2.7238804   2.4219048   Filing 1    Who's at the Door?
1.3769416   1.7417689   1.5411681   1.6315268   1.4034428   2.0020882   1.5563825   1.1329947   1.1466544   1.4037866   1.2279484   1.0863116   1.1081301   0.9657535   0.9496937   ProcessServing 1    Adobe Acrobat
1.5634082   1.9899706   1.8965844   2.0455116   2.0640787   1.8585767   1.4652345   1.5646704   0.9417121   1.5804423   1.3644140   0.8991399   0.8865172   1.4111854   1.1476721   ProcessServing 1    EService

This is only sample data; in total I have 4 Type categories and 7 Subject categories. Here is the output of dput(head(avgRevenueBySubject)):

structure(list(Month1 = c(2.32452852540217, 2.39838024319443, 
1.38763119669326, 1.67197010492586, 2.39230240910008, 2.56177491674571
), Month2 = c(2.25983235807464, 2.80008703157276, 1.92684823894878, 
1.81781945992201, 3.11274605464608, 2.39007978845121), Month3 = c(2.45378041585838, 
2.73603115114115, 2.15154625461568, 2.28897180500678, 3.2072070366587, 
2.42619683055328), Month4 = c(2.50950054817085, 2.89118356394795, 
2.19502538520019, 2.28141567102663, 3.0504767706406, 3.24637686954766
), Month5 = c(2.53858195315855, 2.5939498734771, 2.35786859462019, 
2.24828684346212, 3.02313315871281, 2.86228969522596), Month6 = c(2.20551945443653, 
2.11372073519497, 2.24466703665554, 2.17193033864937, 2.70377966653074, 
2.94296818999896), Month7 = c(2.09246043688626, 2.50841794197685, 
2.30673064217475, 1.91220323933604, 2.7369954829105, 3.33012570803583
), Month8 = c(2.22553189078165, 2.44113695766249, 2.26140266497664, 
1.764621178248, 2.62183982786095, 2.57124386952199), Month9 = c(1.99424045532198, 
1.9091795918852, 2.20375474567921, 1.75651288161892, 2.40383936923673, 
2.13798204834703), Month10 = c(2.15229842709522, 2.52246522784505, 
2.01002146553544, 1.74130180371386, 2.53194432666157, 2.12970742947938
), Month11 = c(2.26866642573734, 2.21939880654197, 1.96811894944027, 
1.54314755700399, 2.81563101112808, 1.81719515748861), Month12 = c(2.21540768941806, 
2.09996453939828, 2.14269489036386, 1.69009446249139, 2.52435113546707, 
1.30659644388318), Month13 = c(2.01407795696169, 2.19110438349199, 
2.08594091270487, 1.66310710284536, 2.30479375587374, 0.672996949673077
), Month14 = c(1.85702016208139, 2.18375170870693, 2.28394628775105, 
1.64612604028705, 2.51616863736761, 0.234263615828042), Month15 = c(1.7562791061015, 
2.38349140169948, 1.96156382849473, 1.78529825283472, 2.36734279344632, 
0.264301216598792), Type = structure(c(2L, 2L, 2L, 2L, 2L, 2L
), .Label = c("eServices 1", "Filing 1", "ProcessServing 1", 
"Research 1"), class = "factor"), Subject = c("Adobe Acrobat", 
"EService", "OCeFiling", "SD eFiling", "Saving Trees & Time", 
"Tools of the Trade")), .Names = c("Month1", "Month2", "Month3", 
"Month4", "Month5", "Month6", "Month7", "Month8", "Month9", "Month10", 
"Month11", "Month12", "Month13", "Month14", "Month15", "Type", 
"Subject"), row.names = c(NA, 6L), class = "data.frame")

I tried to plot this data with the following code:

library(ggplot2)
library(GGally)  # provides ggparcoord()

q <- ggparcoord(data = avgRevenueBySubject,
                columns = 1:15, 
                groupColumn = 17, 
                showPoints = FALSE, 
                alphaLines = 0.3,
                shadeBox = NULL,
                scale = "globalminmax",
                title = "Average Revenue by Training Subject"
)  +
  geom_vline(aes(xintercept=3.5),color='blue',linetype="dashed", size=1) +
  facet_wrap(~Subject,scales='fixed', nrow = 4) + geom_line(size=1)
q <- q + theme_minimal() + xlab('Months') + ylab('Average Revenue (on log scale)') +
  theme(legend.position = "none") + theme(axis.text.y = element_text(hjust=0, angle=0), 
                                          axis.text.x = element_text(hjust=1, angle=45),
                                          plot.title = element_text(size=20))
q

I get the following plot:

[image: the resulting faceted plot]

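As you can see, each facet gets a different color, but every line within a single panel has the same color.

I would like the 4 lines in each panel to have different colors, and those colors to be consistent across the facets.

Any help is greatly appreciated.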

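1 Answer:

Answer 0 (score: 1)

As far as I can tell, ggparcoord drops the columns it does not use from the data set. So if you want to facet on a variable that is not referenced in the ggparcoord() call, you will run into problems.

One workaround is to modify the data inside the ggplot object directly. Normally I would say this is a bad idea, but right now I do not see any other way.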

library(ggplot2)
library(GGally)  # provides ggparcoord()

# group (and therefore color) the lines by Type instead of Subject
q <- ggparcoord(data = avgRevenueBySubject,
                columns = 1:15,
                showPoints = FALSE,
                alphaLines = 0.3,
                groupColumn = "Type",
                shadeBox = NULL,
                scale = "globalminmax",
                title = "Average Revenue by Training Subject"
)
# data to merge: maps each row ID back to its Subject
mm <- cbind.data.frame(.ID = 1:nrow(avgRevenueBySubject), Subject = avgRevenueBySubject$Subject)
# merge Subject back into the plot data (merge() joins on the shared column names)
q$data <- merge(q$data, mm)
# finish plot commands
q <- q + geom_vline(aes(xintercept = 3.5), color = 'blue', linetype = "dashed", size = 1) +
  facet_wrap(~Subject, scales = 'fixed', nrow = 4) + geom_line(size = 1)
q <- q + theme_minimal() + xlab('Months') + ylab('Average Revenue (on log scale)') +
  theme(legend.position = "none") + theme(axis.text.y = element_text(hjust = 0, angle = 0),
                                          axis.text.x = element_text(hjust = 1, angle = 45),
                                          plot.title = element_text(size = 20))
# display the plot
q
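
The merge step is the part of this workaround that is easiest to get wrong, so here is a minimal base R sketch of what it does. The data frames plot_data and lookup are made-up toy stand-ins (their column names and values are assumptions for illustration, not the real contents of q$data); the only point is that merge() joins on the shared .ID column and attaches Subject to every long-format row.

```r
# Toy stand-in for the long-format data stored in the plot object.
# Column names and values are illustrative assumptions, not real output.
plot_data <- data.frame(
  .ID      = factor(rep(1:3, times = 2)),
  variable = rep(c("Month1", "Month2"), each = 3),
  value    = c(2.3, 2.4, 1.4, 2.2, 2.8, 1.9)
)

# Lookup table: one row per original observation, keyed by row number.
lookup <- data.frame(
  .ID     = 1:3,
  Subject = c("Adobe Acrobat", "EService", "OCeFiling")
)

# With no `by` argument, merge() joins on the column names the two
# data frames have in common, which here is only .ID.
merged <- merge(plot_data, lookup)
print(merged)
```

Every row of the toy plot data keeps its values and gains the matching Subject, which is what lets facet_wrap(~Subject) find the variable afterwards. If the two data frames ever shared more than one column name, merge() would join on all of them, so it is worth checking names() on both sides before merging.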