ggplot / scatterplot of ranks in one year versus ranks in a different year

Time: 2015-11-15 11:02:46

Tags: r ggplot2 reshape2

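This seems fairly simple. If you know how, please give me a pointer. Using the ggplot() function of the ggplot2 package (by Hadley Wickham) and the melt() command of the reshape2 package (by Hadley Wickham), with the data arranged in long format, I would like to plot the ids of a variable, ordered by rank in 2009, against their rank in 2007.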

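My best shot:

ggplot(data = df, aes(
    x = reorder(subset(id, year == "2007"), subset(rank, year == "2007")),
    y = reorder(subset(id, year == "2009"), subset(rank, year == "2009")))) +
    geom_point()

[image: plot produced by the code above]

In the plot above, the points lie on the 45-degree line, e.g. at (HSBC, HSBC), rather than at the intended (x, y) intersections. The ids are ordered along the x-axis as expected, but in reverse order along the y-axis.

Note: my eventual aim is to produce a bubble chart, with point size proportional to value and with the variable label and value printed next to each circle.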

2 answers:

Answer 0 (score: 3)

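I think the main problem is that your data is in a "too long" format, i.e. your x values (rank in 2007) and your y values (rank in 2009) end up in the same column. Perhaps you can easily change this upstream, in a data-massaging step not shown in your post.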
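To make the point concrete, here is a minimal sketch in base R. The three-bank data frame `long` is a hypothetical excerpt put together for illustration from figures that appear elsewhere in this thread, assuming the long-format columns are `id`, `year`, `value` and `rank`; it is not the asker's actual object.

```r
# Hypothetical 3-bank excerpt in long format: one row per id-year pair.
long <- data.frame(
  id    = rep(c("HSBC", "Citigroup", "Barclays"), times = 2),
  year  = rep(c("2007", "2009"), each = 3),
  value = c(215, 255, 91, 97, 19, 7.4),
  rank  = c(2, 1, 10, 1, 10, 14),
  stringsAsFactors = FALSE
)

# Both years share the single 'rank' column, so there is no column
# that could serve as x and another that could serve as y.
table(long$id)  # each id occurs twice

# Pairing the two years by id gives one row per bank, with the
# 2007 rank and the 2009 rank side by side.
r07 <- long[long$year == "2007", c("id", "rank")]
r09 <- long[long$year == "2009", c("id", "rank")]
wide <- merge(r07, r09, by = "id", suffixes = c("_2007", "_2009"))
wide
```

Matching on `id` is the essential step here: it is what keeps a bank's 2007 rank and 2009 rank together in one row, so that each bank becomes a single (x, y) point.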

Anyway, given the data in your post, I would first convert it to a wider format (here using data.table::dcast), so that the x and y values sit in separate columns:

library(data.table)
df2 <- dcast(setDT(df), id ~ year, value.var = c("value", "rank"))
head(df2)
#                 id value_2007 value_2009 rank_2007 rank_2009
# 1:     BNP Paribas        108       32.5         7         6
# 2:        Barclays         91        7.4        10        14
# 3:       Citigroup        255       19.0         1        10
# 4: Credit Agricole         67       17.0        14        11
# 5:   Credit Suisse         75       27.0        13         7
# 6:   Deutsche Bank         76       10.3        12        13

Then the plotting is fairly straightforward:

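library(ggplot2)
# one point per bank: x = rank in 2007, y = rank in 2009
ggplot(data = df2, aes(x = rank_2007, y = rank_2009, label = id)) +
  geom_text(vjust = 1) +
  # two overlaid bubbles per bank, sized by the 2007 and 2009 values
  geom_point(aes(size = value_2007), alpha = 0.2) +
  geom_point(aes(size = value_2009), alpha = 0.2)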

[image: plot produced by the code above]

Of course, a lot could be done by way of prettification (label positioning, point sizes, etc.), but that is another story.
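As a rough illustration of the kind of tweaks meant here, the sketch below prints each label together with its value to the right of the bubble and scales bubble area to the value. The data frame `toy` is a hypothetical three-bank excerpt using the same column names as `df2`, and the particular settings (`nudge_x`, `max_size`, axis limits) are arbitrary choices for this example, not recommendations from the answer.

```r
library(ggplot2)

# Hypothetical 3-bank excerpt in wide format (column names as in df2).
toy <- data.frame(
  id         = c("HSBC", "Citigroup", "Barclays"),
  value_2009 = c(97, 19, 7.4),
  rank_2007  = c(2, 1, 10),
  rank_2009  = c(1, 10, 14)
)

p <- ggplot(toy, aes(x = rank_2007, y = rank_2009)) +
  # bubble area proportional to the 2009 value
  geom_point(aes(size = value_2009), alpha = 0.3) +
  # label and value printed to the right of each bubble
  geom_text(aes(label = paste0(id, " (", value_2009, ")")),
            hjust = 0, nudge_x = 0.5, size = 3) +
  scale_size_area(max_size = 15) +
  # leave room on the right so the labels are not clipped
  scale_x_continuous(limits = c(0, 16))
p
```

`scale_size_area()` is used instead of `scale_size()` so that a value of zero would map to a bubble of zero area; whether that is the right choice depends on how the bubbles are meant to be read.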

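Answer 1 (score: 1)

lukeA (in the comments section) and Henrik provided some good suggestions and answers to my question. Thanks! As a follow-up, I would like to show here how I was able to combine their suggestions to produce bubble charts that visualise the rank correlation: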

[image: first plot (Version 1, solid shape with colour)]

[image: second plot (Version 2, hollow shape with fill)]

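The first plot uses geom_point() with colour and size, whereas the second uses geom_point() with fill and size instead, together with the shape = 21 argument, to get hollow circles printed in the legend. I find the solid black bubbles of the size legend a bit overwhelming visually.

For some reason, the name argument of the legends is not printed, which is something I have not come across before and cannot explain. Perhaps the colours and shapes need some more tweaking... Comments are welcome!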

df <- structure(list(id = c("HSBC", "JP Morgan", "Santander", "UBS", 
"Goldman Sachs", "BNP Paribas", "Credit Suisse", "Unicredit", 
"Societe Generale", "Citigroup", "Credit Agricole", "Morgan Stanley", 
"Deutsche Bank", "Barclays", "Royal Bank of Scotland"), value.2007 = c(215L, 
165L, 116L, 116L, 100L, 108L, 75L, 93L, 80L, 255L, 67L, 49L, 
76L, 91L, 120L), value.2009 = c(97, 85, 64, 35, 35, 32.5, 27, 
26, 26, 19, 17, 16, 10.3, 7.4, 4.6), rank.2007 = c(2, 3, 6, 5, 
8, 7, 13, 9, 11, 1, 14, 15, 12, 10, 4), rank.2009 = c(1, 2, 3, 
4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)), .Names = c("id", 
"value.2007", "value.2009", "rank.2007", "rank.2009"), row.names = c(15L, 
14L, 12L, 9L, 11L, 7L, 10L, 8L, 5L, 13L, 4L, 1L, 3L, 6L, 2L), class = "data.frame")


## Comments:
# 1. properly scale bubbles in bubble-chart:
v1 <- min(df$value.2007, df$value.2009)
v2 <- max(df$value.2007, df$value.2009)
# use + scale_size(range = c(v1, v2)/10) or similar
# 2. increase the size of points in the legend
# with + guides(colour = guide_legend(override.aes = list(size = 10)))
# 3. add a name to the legend guides: FAIL!
# + scale_size(name = "Market Cap ($bn)", range = c(v1, v2)/10)
# + scale_color_manual(name = "Year", values = c("royalblue", "forestgreen")) 

# Version 1. solid shape with colour
library("ggplot2")
library("scales")
p <- ggplot(data = df, aes(x = rank.2007, y = rank.2009, label = id)) +        
    geom_point(aes(size = value.2007, colour = "2007"), alpha = 0.8) +
    geom_point(aes(size = value.2009, colour = "2009"), alpha = 0.8) +
    geom_text(size = 4, vjust = -5) +
    scale_x_continuous(limits = c(-1, 17), breaks = seq(1, 16, 2)) +
    scale_y_continuous(limits = c(-1, 17), breaks = seq(1, 16, 2)) +
    coord_fixed() +
    scale_color_manual(values = c("royalblue", "forestgreen")) +
    scale_size(range = c(v1, v2)/10) +
    guides(colour = guide_legend(override.aes = list(size = 10))) +
    theme_bw() +
    xlab("Rank by Market Capitalization in 2007") +
    ylab("Rank by Market Capitalization in 2009") +
    ggtitle("Market Capitalization Before and After the Crisis \n(Selected Banks: 2009 versus 2007)") +
    theme(legend.position = "right", legend.direction = "vertical") +
    theme(legend.title = element_blank()) +
    theme(legend.key = element_blank())
p

ggsave(filename = "p1.jpg", plot = p, width = 12, height = 10)

# Version 2: hollow shape with fill
library("ggplot2")
library("scales")
p <- ggplot(data = df, aes(x = rank.2007, y = rank.2009, label = id)) +        
    geom_point(aes(size = value.2007, fill = "2007"), 
               shape = 21, alpha = 0.8) +
    geom_point(aes(size = value.2009, fill = "2009"), 
               shape = 21, alpha = 0.8) +
    geom_text(size = 4, vjust = -5) +
    scale_size(name = "Market Cap ($bn)", range = c(v1, v2)/10) +
    scale_shape(solid = FALSE) + # combined with shape=21
    scale_x_continuous(limits = c(-1, 17), breaks = seq(1, 16, 2)) +
    scale_y_continuous(limits = c(-1, 17), breaks = seq(1, 16, 2)) +
    coord_fixed() +
    scale_fill_manual(name = "Year", values = c("royalblue", "forestgreen")) +
    guides(fill = guide_legend(override.aes = list(size = 10))) +
    theme_bw() +
    xlab("Rank by Market Capitalization in 2007") +
    ylab("Rank by Market Capitalization in 2009") +
    ggtitle("Market Capitalization Before and After the Crisis \n(Selected Banks: 2009 versus 2007)") +
    theme(legend.position = "right", legend.direction = "vertical") +
    theme(legend.title = element_blank()) +
    theme(legend.key = element_blank())
p

ggsave(filename = "p2.jpg", plot = p, width = 12, height = 10)
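
On the legend names that are not printed: one likely explanation, judging only from the code in this answer, is the `theme(legend.title = element_blank())` line that both versions contain, since that theme setting removes legend titles regardless of any `name` given to a scale. The sketch below reproduces the effect on hypothetical toy data (`toy`, `with_title` and `without_title` are names made up for this example).

```r
library(ggplot2)

# Hypothetical toy data, only to reproduce the legend behaviour.
toy <- data.frame(x = c(1, 2, 3), y = c(3, 1, 2), v = c(10, 20, 30))

base <- ggplot(toy, aes(x = x, y = y, size = v)) +
  geom_point() +
  scale_size(name = "Market Cap ($bn)")

# Legend title is drawn: the scale name is used as the title.
with_title <- base + theme_bw()

# Legend title is suppressed, whatever name the scale carries.
without_title <- base + theme_bw() +
  theme(legend.title = element_blank())

with_title
without_title
```

If that is indeed the cause, dropping the `theme(legend.title = element_blank())` line should let the `name` arguments of Version 2 show up. Version 1 would additionally need `name` passed to `scale_size()` and `scale_color_manual()`, as in comment 3 of the code.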