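Ordering bars by the difference between variables

Time: 2018-02-05 16:02:50

Tags: r ggplot2 tidyverse

My goal is to draw a bar chart that shows the variables "HH_FIN_EX" and "ACT_IND_CON_EXP", with the bars sorted in ascending order of the variable diff. diff itself should not appear in the chart.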

library(eurostat)
library(tidyverse)

#getting the data
data1 <- get_eurostat("nama_10_gdp",time_format = "num")

#filtering
data_1_4 <- data1 %>% 
        filter(time=="2016", 
               na_item %in% c("B1GQ", "P31_S14_S15", "P41"), 
               geo %in% c("BE","BG","CZ","DK","DE","EE","IE","EL","ES","FR","HR","IT","CY","LV","LT","LU","HU","MT","NL","AT","PL","PT","RO","SI","SK","FI","SE","UK"), 
               unit=="CP_MEUR")%>% select(-unit, -time)

#transformations and calculations
data_1_4 <- data_1_4 %>% 
        spread(na_item, values)%>% 
        na.omit() %>% 
        mutate(HH_FIN_EX = P31_S14_S15/B1GQ, ACT_IND_CON_EXP=P41/B1GQ, diff=ACT_IND_CON_EXP-HH_FIN_EX) %>%
        gather(na_item, values,  2:7)%>%
        filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff")) 
#plotting
ggplot(data=data_1_4, aes(x=reorder(geo, values), y=values, fill=na_item))+
        geom_bar(stat="identity", position=position_dodge(), colour="black")+
        labs(title="", x="Countries", y="As percentage of GDP")


I would appreciate any suggestions on how to do this, because aes(x=reorder(geo, values[values=="diff"])) causes an error.

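3 answers:

Answer 0 (score: 2)

First, you should not include diff (your result column) when you call gather; it complicates things.
Change the line gather(na_item, values, 2:7) to gather(na_item, values, 2:6).

You can use this code to calculate the difference and sort the rows in descending order (using dplyr::arrange); drop desc() to get the ascending order the question asks for: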

plotData <- data_1_4 %>% 
        spread(na_item, values) %>% 
        na.omit() %>% 
        mutate(HH_FIN_EX = P31_S14_S15 / B1GQ, 
               ACT_IND_CON_EXP = P41 / B1GQ, 
               diff = ACT_IND_CON_EXP - HH_FIN_EX) %>%
        gather(na_item, values, 2:6) %>%
        filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP")) %>%
        arrange(desc(diff))

Plot it with:

ggplot(plotData, aes(geo, values, fill = na_item))+
    geom_bar(stat = "identity", position = "dodge", color = "black") +
    labs(x = "Countries", 
         y = "As percentage of GDP") +
    scale_x_discrete(limits = unique(as.character(plotData$geo)))  # one entry per country, in sorted order
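
The approach above can be tried without downloading anything from Eurostat. The following is a minimal sketch on a made-up three-country table (the numbers and the names wide, plot_data and geo_order are assumptions for illustration, not part of the original answer). It keeps diff as its own column, sorts by it, and passes the resulting country order to scale_x_discrete().

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for the wide table after spread(); the numbers are invented.
wide <- tibble(
  geo             = c("AT", "BE", "CZ"),
  HH_FIN_EX       = c(0.52, 0.51, 0.47),
  ACT_IND_CON_EXP = c(0.70, 0.60, 0.62)
) %>%
  mutate(diff = ACT_IND_CON_EXP - HH_FIN_EX)

# Gather only the two plotted columns; diff stays behind as its own column.
plot_data <- wide %>%
  gather(na_item, values, HH_FIN_EX, ACT_IND_CON_EXP) %>%
  arrange(diff)  # ascending; wrap in desc() for descending

# Each country appears twice in plot_data, so keep one entry per country.
geo_order <- unique(plot_data$geo)

p <- ggplot(plot_data, aes(geo, values, fill = na_item)) +
  geom_col(position = "dodge", colour = "black") +
  scale_x_discrete(limits = geo_order) +
  labs(x = "Countries", y = "As percentage of GDP")
```

Printing p draws the chart. Because diff is never gathered into na_item, it cannot show up as a bar, yet it is still available for sorting.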


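Answer 1 (score: 0)

You can work out the desired order explicitly (it is stored in country_order below) and force the factor geo to have its levels in that order. Then run ggplot after filtering out the diff variable. So, replace your call to ggplot with the following: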

country_order = (data_1_4 %>% filter(na_item == 'diff') %>% arrange(values))$geo
data_1_4$geo = factor(data_1_4$geo, country_order)
ggplot(data=filter(data_1_4, na_item != 'diff'), aes(x=geo, y=values, fill=na_item))+
  geom_bar(stat="identity", position=position_dodge(), colour="black")+
  labs(title="", x="Countries", y="As percentage of GDP") 
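
To see what the two preparation steps do, here is a small sketch using only base R on an invented long-format table (the data and the names long, diff_rows and plot_rows are assumptions for illustration). It reads the country order off the diff rows, fixes the factor levels, and then drops diff so it is never drawn.

```r
# Toy long-format data; the values are invented for illustration.
long <- data.frame(
  geo     = rep(c("AT", "BE", "CZ"), each = 3),
  na_item = rep(c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff"), times = 3),
  values  = c(0.52, 0.70, 0.18,
              0.51, 0.60, 0.09,
              0.47, 0.62, 0.15),
  stringsAsFactors = FALSE
)

# Take the diff rows, sort them ascending, and read off the country order.
diff_rows <- long[long$na_item == "diff", ]
country_order <- diff_rows$geo[order(diff_rows$values)]

# Fix the factor levels first, then remove the diff rows from the plot data.
long$geo <- factor(long$geo, levels = country_order)
plot_rows <- long[long$na_item != "diff", ]

levels(plot_rows$geo)
```

ggplot2 places a discrete x axis in factor-level order, so plot_rows can be passed to the ggplot call above unchanged.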

Doing this, I get the following plot:

[image: resulting plot]

Answer 2 (score: -1)

Is this what you are looking for?

data_1_4 %>%
        mutate(Val = fct_reorder(geo, values, .desc = TRUE)) %>%
        filter(na_item %in% c("HH_FIN_EX", "ACT_IND_CON_EXP")) %>%
        ggplot(aes(x=Val, y=values, fill=na_item)) +
        geom_bar(stat="identity", position=position_dodge(), colour="black") +
        labs(title="", x="Countries", y="As percentage of GDP")
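
One thing to watch with this approach: fct_reorder() summarises with the median by default, so on the long table it ranks each country by the median of all its rows rather than by diff. The sketch below uses base R's reorder() on invented data to show the two orderings side by side (the data and the zero-and-sum trick are assumptions for illustration, not the answerer's code).

```r
# Toy long-format data; the values are invented for illustration.
long <- data.frame(
  geo     = rep(c("AT", "BE", "CZ"), each = 3),
  na_item = rep(c("HH_FIN_EX", "ACT_IND_CON_EXP", "diff"), times = 3),
  values  = c(0.52, 0.70, 0.18,
              0.51, 0.60, 0.09,
              0.47, 0.62, 0.15),
  stringsAsFactors = FALSE
)

# Ranking by the median of all three rows per country.
by_median <- reorder(long$geo, long$values, FUN = median)

# Ranking by diff alone: zero out the other rows, then sum per country,
# so each country's score is exactly its diff value.
score <- ifelse(long$na_item == "diff", long$values, 0)
by_diff <- reorder(long$geo, score, FUN = sum)

levels(by_median)
levels(by_diff)
```

On this toy data the two level orders differ, which is why the ordering should be computed from the diff rows only, before they are filtered out.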
