R: Automatically stagger overlapping labels in a ggplot slope graph

Time: 2017-11-21 01:15:17

Tags: r ggplot2

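When creating a slope graph with ggplot2, as shown below, I find that many of my labels overlap when the data points are close together. How can I change the labels so that they are automatically staggered wherever they overlap?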

library(ggplot2)
library(scales)
install.packages("Lock5Data", repos = "http://cran.us.r-project.org")  # you might need this
library(Lock5Data)
data("NBAStandings1e")
data("NBAStandings2016")


colnames(NBAStandings1e)[4] <- "year1"    # 2010-2011
colnames(NBAStandings2016)[4] <- "year2"  # 2015-2016
nba_df <- merge(NBAStandings1e[,c('Team','year1')], NBAStandings2016[,c('Team','year2')])
scale <- dim(nba_df)[1] 

a<-nba_df
p<-ggplot(nba_df) + geom_segment(aes(x=0,xend=scale,y=year1,yend=year2),size=.75)

# clear junk
p<-p + theme(panel.background = element_blank())
p<-p + theme(panel.grid=element_blank())
p<-p + theme(axis.ticks=element_blank())
# p<-p + theme(axis.text=element_blank())
p<-p + theme(panel.border=element_blank())
# p<-p + theme(panel.grid.major = element_line(linetype = "dashed", fill = NA))
p<-p + theme(panel.grid.major = element_line(linetype = "dashed",color = "grey80"))
p<-p + theme(panel.grid.major.x = element_blank())
p<-p + theme(axis.text.x = element_blank())


# annotate
p<-p + xlab("") + ylab("Percentage Wins")
p<-p + xlim((-5),(scale+12))
p<-p + geom_text(label="2010-2011 Season", x=0,     y=(1.1*(max(a$year2,a$year1))),hjust= 1.2,size=3)
p<-p + geom_text(label="2015-2016 Season", x=scale, y=(1.1*(max(a$year2,a$year1))),hjust=-0.1,size=3)
p<-p + geom_text(label=nba_df$Team, y=nba_df$year2, x=rep.int(scale,dim(a)[1]),hjust=-0.2,size=2)
p<-p + geom_text(label=nba_df$Team, y=nba_df$year1, x=rep.int( 0,dim(a)[1]),hjust=1.2,size=2)
p

1 answer:

Answer 0 (score: 4)

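Since the overlapping teams have identical winning percentages, you can handle the overlap more simply by combining the labels of teams that share a winning percentage. I also made a few other changes to your code, intended to simplify it.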

library(Lock5Data)
library(tidyverse)
library(scales)

data("NBAStandings1e")
data("NBAStandings2016")
colnames(NBAStandings1e)[4] <- "2010-11"    # 2010-2011
colnames(NBAStandings2016)[4] <- "2015-16"  # 2015-2016
nba_df <- merge(NBAStandings1e[,c('Team','2010-11')], NBAStandings2016[,c('Team','2015-16')])

# Convert data to long format: one row per Team and Season
# (tidyr::pivot_longer() is the newer equivalent of gather())
dat = gather(nba_df, Season, value, -Team)

# Combine labels for teams with same winning percentage (see footnote * below)
dat_lab = dat %>% group_by(Season, value) %>% 
  summarise(Team = paste(Team, collapse="\U2014"))  # \U2014 is the emdash character

ggplot(dat, aes(Season, value, group=Team)) +
  geom_line() +
  theme_minimal() + theme(panel.grid.minor=element_blank()) +
  labs(y="Winning Percentage") +
  scale_y_continuous(limits=c(0,1), labels=percent) +
  geom_text(data=subset(dat_lab, Season=="2010-11"), aes(label=Team, x=0.98), hjust=1, size=2) +
  geom_text(data=subset(dat_lab, Season=="2015-16"), aes(label=Team, x=2.02), hjust=0, size=2)


Here is a close-up of how the labels look:

[Image: close-up of the combined labels]
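To see what the label-combining step does in isolation, here is a minimal sketch on made-up data. It is not the answer's code: it uses base R's aggregate() in place of the dplyr group_by()/summarise() pipeline, a comma separator in place of the dash, and the toy teams A to D are assumptions for illustration only.

```r
# Toy data (made up for illustration): teams A and B share a winning percentage
dat <- data.frame(
  Team   = c("A", "B", "C", "D"),
  Season = "2010-11",
  value  = c(0.50, 0.50, 0.61, 0.70)
)

# One row per (Season, value); team names sharing a value are pasted together
dat_lab <- aggregate(Team ~ Season + value, data = dat,
                     FUN = paste, collapse = ", ")
print(dat_lab)
```

Because each distinct (Season, value) pair now has exactly one label row, geom_text() draws one combined label at that position instead of several on top of each other.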

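* If some teams overlap because their winning percentages are very close but not equal, you can still group them by rounding. For example, to group teams whose winning percentages are the same after rounding to the nearest 2%, you could do the following: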

dat_lab = dat %>% group_by(Season, group=round(value/0.02)*0.02) %>% 
  summarise(Team = paste(Team, collapse="\U2014"),
            value = mean(value))

This places each label at the mean value of its group.
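The rounding step can be checked on a small example. The sketch below is a base R approximation of the idea, assuming made-up teams and values; the column names bin and y_lab are hypothetical helpers, not part of the answer's code.

```r
# Toy data (made up): A and B are close but not equal
dat <- data.frame(
  Team   = c("A", "B", "C", "D"),
  Season = "2010-11",
  value  = c(0.500, 0.506, 0.540, 0.700)
)

# Snap each value to the nearest multiple of 0.02 (the 2% bins)
dat$bin <- round(dat$value / 0.02) * 0.02

# Mean of the original values within each (Season, bin) group;
# ave() applies mean() per group by default
dat$y_lab <- ave(dat$value, dat$Season, dat$bin)

# One label row per bin, with team names pasted together
dat_lab <- aggregate(Team ~ Season + bin + y_lab, data = dat,
                     FUN = paste, collapse = ", ")
print(dat_lab)
```

Here A and B fall into the same bin and share one label placed at the mean of their two values, while C and D keep their own labels. A wider bin merges more labels but moves each combined label further from the individual line ends it describes.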