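Fill values not mapped to the correct factor in ggplot

Time: 2020-04-28 16:56:51

Tags: r ggplot2

I am trying to draw a scatter plot colored by a variable, and to use geom_rect to fill bands along the x axis. However, I cannot figure out how to get the factors to map in the correct order.

Here is a sample of my data: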

head(prod_cons_diff, n = 10)
# A tibble: 10 x 10
   country             year cons.e iso3c terr.e diff.prod.cons.e prod.cons                 continent xstart  xend
   <chr>              <int>  <dbl> <chr>  <dbl>            <dbl> <chr>                     <chr>      <dbl> <dbl>
 1 China               2017  2333. CHN    2685.           352.   Territorial > Consumption Asia         0.5   1.5
 2 USA                 2017  1552. USA    1439.          -113.   Consumption > Territorial Americas     1.5   2.5
 3 India               2017   617. IND     671.            53.8  Territorial > Consumption Asia         2.5   3.5
 4 Japan               2017   380. JPN     324.           -55.9  Consumption > Territorial Asia         3.5   4.5
 5 Russian Federation  2017   375. RUS     450.            74.9  Territorial > Consumption Europe       4.5   5.5
 6 Germany             2017   244. DEU     218.           -26.4  Consumption > Territorial Europe       5.5   6.5
 7 South Korea         2017   183. KOR     175.            -7.79 Consumption > Territorial Asia         6.5   7.5
 8 Saudi Arabia        2017   169. SAU     173.             3.62 Territorial > Consumption Asia         7.5   8.5
 9 Iran                2017   166. IRN     187.            20.8  Territorial > Consumption Asia         8.5   9.5
10 Indonesia           2017   164. IDN     159.            -4.62 Consumption > Territorial Asia         9.5  10.5

When I run the following ggplot script:

ggplot(prod_cons_diff, aes(x = fct_reorder(country, diff.prod.cons.e), y = diff.prod.cons.e * 3.664)) + 
  geom_point(aes(col = prod.cons)) + # add geom_point otherwise i can't map geom_rect (continuous) to country (discrete)
  geom_rect(aes(ymin = -1500, ymax = 1500, 
                xmin = xstart, xmax = xend, 
                fill = continent), alpha = 0.3, col = NA) + 
  geom_point(aes(col = prod.cons)) + # re-add geom_point so that it appears on top of the fill
  geom_hline(yintercept = 0, linetype = 'dashed') +
  coord_flip() +
  scale_color_manual(values = c('red', 'blue')) + 
  theme_minimal()


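But the fill variable is clearly wrong: China is not in Europe, the USA is not in Asia, and so on.

I tried setting country and continent as factors with specific levels, but could not get it right. I also tried using as_factor() from forcats, following answer 2 here (mapping (ordered) factors to colors in ggplot), but the function could not be found. as_factor() seems to be in sjlabelled (https://www.rdocumentation.org/packages/sjlabelled/versions/1.1.3/topics/as_factor), but that did not work either.

I tried to make a simple reproducible example, but there the factors map correctly. Essentially, I cannot work out how the factor levels are being assigned across continents and countries.

I imagine there is a simple solution, but I have been banging my head against the wall on this one.

In response to @Matt's comment below: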

> dput(head(prod_cons_diff, n = 10))
structure(list(country = c("China", "USA", "India", "Japan", 
"Russian Federation", "Germany", "South Korea", "Saudi Arabia", 
"Iran", "Indonesia"), year = c(2017L, 2017L, 2017L, 2017L, 2017L, 
2017L, 2017L, 2017L, 2017L, 2017L), cons.e = c(2333.11521896672, 
1552.00682401808, 616.7239620176, 380.216883675894, 374.633869915012, 
244.223647570196, 182.62081469552, 169.164508003068, 166.402218417086, 
164.032430920609), iso3c = c("CHN", "USA", "IND", "JPN", "RUS", 
"DEU", "KOR", "SAU", "IRN", "IDN"), terr.e = c(2685.24946186172, 
1438.52306916917, 670.566180528622, 324.269234030281, 449.519945642447, 
217.785589557643, 174.832142238684, 172.780926461956, 187.211971723987, 
159.409240780077), diff.prod.cons.e = c(352.134242894999, -113.483754848911, 
53.8422185110221, -55.9476496456134, 74.8860757274351, -26.4380580125526, 
-7.78867245683526, 3.61641845888749, 20.8097533069009, -4.62319014053256
), prod.cons = c("Territorial > Consumption", "Consumption > Territorial", 
"Territorial > Consumption", "Consumption > Territorial", "Territorial > Consumption", 
"Consumption > Territorial", "Consumption > Territorial", "Territorial > Consumption", 
"Territorial > Consumption", "Consumption > Territorial"), continent = c("Asia", 
"Americas", "Asia", "Asia", "Europe", "Europe", "Asia", "Asia", 
"Asia", "Asia"), xstart = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 
7.5, 8.5, 9.5), xend = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 
9.5, 10.5)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-10L))

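1 answer:

Answer 0 (score: 1)

Because you define your geom_rect based on x values that were set before the dataset was reordered, those values no longer match the new ordering.

So you need to recalculate the xstart and xend positions for geom_rect so that they match the new order of the dataset.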
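To make the mismatch concrete, here is a minimal sketch in base R, using rounded values from the question's data. It assumes that stats::reorder() orders the levels the same way fct_reorder() does here, which holds because each country has a single value. The names axis_pos and row_pos are illustrative only.

```r
# Values taken (rounded) from the question's sample data.
country <- c("China", "USA", "India", "Japan", "Russian Federation",
             "Germany", "South Korea", "Saudi Arabia", "Iran", "Indonesia")
diff <- c(352.1, -113.5, 53.8, -55.9, 74.9, -26.4, -7.79, 3.62, 20.8, -4.62)

# Reorder the factor levels by the difference, as the plot's x aesthetic does.
country_f <- reorder(country, diff)

# A discrete axis places each level at its integer code.
axis_pos <- as.integer(country_f)

# xstart/xend in the question were built from the row order instead.
row_pos <- seq_along(country)

print(data.frame(country, row_pos, axis_pos))
```

In this sample, China sits in row 1 but is drawn at axis position 10, so the rectangle spanning 0.5 to 1.5 (filled with China's continent) ends up behind the USA, which is the level drawn at position 1.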

Here is a possible solution using a dplyr pipe sequence (df is the data from the question):

library(dplyr)

df %>%
  arrange(diff.prod.cons.e) %>%
  mutate(country = factor(country, unique(country)),
         continent = factor(continent, unique(continent))) %>%
  mutate(xstart2 = row_number() - 0.5,
         xend2 = row_number() + 0.5)

              country year cons.e iso3c terr.e diff.prod.cons.e               prod.cons continent xstart xend xstart2 xend2
 1                USA 2017   1552   USA   1439          -113.00 Consumption>Territorial  Americas    1.5  2.5     0.5   1.5
 2              Japan 2017    380   JPN    324           -55.90 Consumption>Territorial      Asia    3.5  4.5     1.5   2.5
 3            Germany 2017    244   DEU    218           -26.40 Consumption>Territorial    Europe    5.5  6.5     2.5   3.5
 4        South_Korea 2017    183   KOR    175            -7.79 Consumption>Territorial      Asia    6.5  7.5     3.5   4.5
 5          Indonesia 2017    164   IDN    159            -4.62 Consumption>Territorial      Asia    9.5 10.5     4.5   5.5
 6       Saudi_Arabia 2017    169   SAU    173             3.62 Territorial>Consumption      Asia    7.5  8.5     5.5   6.5
 7               Iran 2017    166   IRN    187            20.80 Territorial>Consumption      Asia    8.5  9.5     6.5   7.5
 8              India 2017    617   IND    671            53.80 Territorial>Consumption      Asia    2.5  3.5     7.5   8.5
 9 Russian_Federation 2017    375   RUS    450            74.90 Territorial>Consumption    Europe    4.5  5.5     8.5   9.5
10              China 2017   2333   CHN   2685           352.00 Territorial>Consumption      Asia    0.5  1.5     9.5  10.5

If you now pass these new positions (xstart2 and xend2) to geom_rect, you get the correct coloring pattern for the continents.
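The answer's plotting code did not survive in this copy. The following is only a sketch of what passing the recalculated positions to geom_rect might look like: it reuses the layers from the question's script, swaps in xstart2 and xend2, and rebuilds a small data frame from rounded values in the question. The names df, plot_df and p are assumptions for illustration.

```r
library(dplyr)
library(ggplot2)

# Rounded values from the question's sample; only the columns the plot needs.
df <- data.frame(
  country = c("China", "USA", "India", "Japan", "Russian Federation",
              "Germany", "South Korea", "Saudi Arabia", "Iran", "Indonesia"),
  diff.prod.cons.e = c(352.1, -113.5, 53.8, -55.9, 74.9,
                       -26.4, -7.79, 3.62, 20.8, -4.62),
  prod.cons = c("Territorial > Consumption", "Consumption > Territorial",
                "Territorial > Consumption", "Consumption > Territorial",
                "Territorial > Consumption", "Consumption > Territorial",
                "Consumption > Territorial", "Territorial > Consumption",
                "Territorial > Consumption", "Consumption > Territorial"),
  continent = c("Asia", "Americas", "Asia", "Asia", "Europe",
                "Europe", "Asia", "Asia", "Asia", "Asia"),
  stringsAsFactors = FALSE
)

# Sort first, fix the factor levels to that order, then derive the
# rectangle edges from the new row positions.
plot_df <- df %>%
  arrange(diff.prod.cons.e) %>%
  mutate(country = factor(country, unique(country)),
         xstart2 = row_number() - 0.5,
         xend2 = row_number() + 0.5)

p <- ggplot(plot_df, aes(x = country, y = diff.prod.cons.e * 3.664)) +
  geom_point(aes(col = prod.cons)) +  # sets up the discrete x scale first
  geom_rect(aes(ymin = -1500, ymax = 1500,
                xmin = xstart2, xmax = xend2,
                fill = continent), alpha = 0.3, col = NA) +
  geom_point(aes(col = prod.cons)) +  # redraw points on top of the fill
  geom_hline(yintercept = 0, linetype = "dashed") +
  coord_flip() +
  scale_color_manual(values = c("red", "blue")) +
  theme_minimal()

print(p)
```

Because the factor levels and the rectangle edges are both derived from the same sorted row order, each band now sits behind the country whose continent it represents.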
