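I'm seeing some strange behavior when using ggplot.
I can't reproduce the problem with a sample dataset, because I can't pin down what is wrong with the dataset I'm using. Essentially, I have two variables from the same dataset, and the linetype aesthetic is applied to one of them but not to the other.
Here is the data frame, temp: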
temp
# A tibble: 504 x 5
# Groups:   continent [6]
   continent  year urban.pop predicted.estimated.pop       pop
   <chr>     <int>     <dbl> <chr>                       <dbl>
 1 Africa     1950  32658962 estimated.pop            32658962
 2 Africa     1955  41419217 estimated.pop            41419217
 3 Africa     1960  53008425 estimated.pop            53008425
 4 Africa     1965  66348577 estimated.pop            66348577
 5 Africa     1970  82637370 estimated.pop            82637370
 6 Africa     1975 103198989 estimated.pop           103198989
 7 Africa     1980 128615954 estimated.pop           128615954
 8 Africa     1985 160721947 estimated.pop           160721947
 9 Africa     1990 200111296 estimated.pop           200111296
10 Africa     1995 241824184 estimated.pop           241824184
I want to plot this data frame as follows:
ggplot(temp, aes(x = year, y = pop, col = continent, linetype = predicted.estimated.pop)) +
geom_line()
This looks fine, but when I change the y axis to plot urban.pop, I get the following result, in which the linetype aesthetic has not been applied:
ggplot(temp, aes(x = year, y = urban.pop, col = continent, linetype = predicted.estimated.pop)) +
geom_line()
As the output above shows, pop and urban.pop are both of class dbl. They also appear to be identical:
sum(temp$pop - temp$urban.pop, na.rm = TRUE)
[1] 0
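A summed difference of zero is weaker evidence than it looks: positive and negative differences can cancel, and na.rm = TRUE silently drops every row where either column is missing. A minimal sketch of a stricter comparison, using hypothetical toy vectors (not the real temp data):

```r
# Hypothetical toy vectors: the differences cancel and one value is missing
pop       <- c(10, 20, 30, NA)
urban.pop <- c(12, 18, 30, 40)

# The check used above reports 0 even though the vectors differ
diff_sum <- sum(pop - urban.pop, na.rm = TRUE)

# Stricter checks: exact identity and tolerance-based equality
same_exact <- identical(pop, urban.pop)
same_close <- isTRUE(all.equal(pop, urban.pop))

# Positions where one vector is NA and the other is not
na_mismatch <- which(is.na(pop) != is.na(urban.pop))

print(diff_sum)     # 0
print(same_exact)   # FALSE
print(same_close)   # FALSE
print(na_mismatch)  # 4
```

Running the same three checks on temp$pop and temp$urban.pop would show whether the two columns really match row by row, including their missing values.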
The only other thing worth noting is that temp is a grouped data frame:
str(temp)
Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 504 obs. of 5 variables:
$ continent : chr "Africa" "Africa" "Africa" "Africa" ...
$ year : int 1950 1955 1960 1965 1970 1975 1980 1985 1990 1995 ...
$ urban.pop : num 32658962 41419217 53008425 66348577 82637370 ...
$ predicted.estimated.pop: chr "estimated.pop" "estimated.pop" "estimated.pop" "estimated.pop" ...
$ pop : num 32658962 41419217 53008425 66348577 82637370 ...
- attr(*, "vars")= chr "continent"
- attr(*, "drop")= logi TRUE
- attr(*, "indices")=List of 6
..$ : int 0 1 2 3 4 5 6 7 8 9 ...
..$ : int 21 22 23 24 25 26 27 28 29 30 ...
..$ : int 42 43 44 45 46 47 48 49 50 51 ...
..$ : int 63 64 65 66 67 68 69 70 71 72 ...
..$ : int 84 85 86 87 88 89 90 91 92 93 ...
..$ : int 105 106 107 108 109 110 111 112 113 114 ...
- attr(*, "group_sizes")= int 84 84 84 84 84 84
- attr(*, "biggest_group_size")= int 84
- attr(*, "labels")='data.frame': 6 obs. of 1 variable:
..$ continent: chr "Africa" "Asia" "Europe" "LAC" ...
..- attr(*, "vars")= chr "continent"
..- attr(*, "drop")= logi TRUE
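Whether the grouping plays any role here is not established, but it is cheap to rule out by dropping it before plotting. A minimal sketch, assuming dplyr is installed and using a hypothetical toy tibble rather than the real temp:

```r
library(dplyr)

# Hypothetical grouped tibble, standing in for temp
toy <- tibble(
  continent = rep(c("Africa", "Asia"), each = 2),
  year      = rep(c(1950L, 1955L), times = 2),
  pop       = c(1, 2, 3, 4)
) %>%
  group_by(continent)

grouped_before <- is_grouped_df(toy)

# ungroup() removes the grouping metadata and leaves the data unchanged
toy_flat <- ungroup(toy)
grouped_after <- is_grouped_df(toy_flat)

print(grouped_before)  # TRUE
print(grouped_after)   # FALSE
```

If the plot built from the ungrouped data still loses the linetype for urban.pop, the grouping is not the cause.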
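I can't figure out why these two variables produce different results for the linetype aesthetic. I need to solve this because my original dataset has another variable that behaves the same way as urban.pop.
Can anyone explain this to me, or help me fix it?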
Answer 0 (score: 2)
I can't really reproduce your problem, but I've added a data example similar to yours. Perhaps comparing the two will reveal the cause.
library(ggplot2)
p1 <- ggplot(temp, aes(x=year, y=pop, col=continent,
linetype=predicted.estimated.pop)) +
geom_line()
p2 <- ggplot(temp, aes(x=year, y=urban.pop, col=continent,
linetype=predicted.estimated.pop)) +
geom_line()
egg::ggarrange(p1, p2)
Yields:
Data
> dput(temp)
structure(list(continent = c("Africa", "Africa", "Africa", "Africa",
"Africa", "Asia", "Asia", "Asia", "Asia", "Asia", "Europe", "Europe",
"Europe", "Europe", "Europe", "Africa", "Africa", "Africa", "Africa",
"Africa", "Asia", "Asia", "Asia", "Asia", "Asia", "Europe", "Europe",
"Europe", "Europe", "Europe"), year = c(1995, 2000, 2005, 2010,
2015, 1995, 2000, 2005, 2010, 2015, 1995, 2000, 2005, 2010, 2015,
2015, 2020, 2025, 2030, 2035, 2015, 2020, 2025, 2030, 2035, 2015,
2020, 2025, 2030, 2035), urban.pop = c(30806083, 46209124.25,
61612165.5, 77015206.75, 92418248, 105455596, 184545293, 263634990,
342724687, 421814384, 24760494, 37140741, 49520988, 61901235,
74281482, 92418248, 115522810, 138627372, 161731934, 184836496,
421814384, 527267980, 632721576, 738175172, 843628768, 74281482,
92851852.5, 111422223, 129992593.5, 148562964), predicted.estimated.pop = c("estimated.pop",
"estimated.pop", "estimated.pop", "estimated.pop", "estimated.pop",
"estimated.pop", "estimated.pop", "estimated.pop", "estimated.pop",
"estimated.pop", "estimated.pop", "estimated.pop", "estimated.pop",
"estimated.pop", "estimated.pop", "predicted.pop", "predicted.pop",
"predicted.pop", "predicted.pop", "predicted.pop", "predicted.pop",
"predicted.pop", "predicted.pop", "predicted.pop", "predicted.pop",
"predicted.pop", "predicted.pop", "predicted.pop", "predicted.pop",
"predicted.pop"), pop = c(30806083, 46209124.25, 61612165.5,
77015206.75, 92418248, 105455596, 184545293, 263634990, 342724687,
421814384, 24760494, 37140741, 49520988, 61901235, 74281482,
92418248, 115522810, 138627372, 161731934, 184836496, 421814384,
527267980, 632721576, 738175172, 843628768, 74281482, 92851852.5,
111422223, 129992593.5, 148562964)), row.names = c(NA, -30L), class = "data.frame")
> str(temp)
'data.frame': 30 obs. of 5 variables:
$ continent : chr "Africa" "Africa" "Africa" "Africa" ...
$ year : num 1995 2000 2005 2010 2015 ...
$ urban.pop : num 30806083 46209124 61612166 77015207 92418248 ...
$ predicted.estimated.pop: chr "estimated.pop" "estimated.pop" "estimated.pop" "estimated.pop" ...
$ pop : num 30806083 46209124 61612166 77015207 92418248 ...
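One difference that str() does not reveal is where the missing values sit. If a y variable has no non-missing values for one level of predicted.estimated.pop, that level has nothing to draw, so only one linetype would appear in the plot. This is only a possible cause, not a confirmed one. A minimal sketch of the check in base R, with a hypothetical toy data frame (temp_toy and its values are made up for illustration):

```r
# Hypothetical data: urban.pop is missing for every predicted row
temp_toy <- data.frame(
  predicted.estimated.pop = rep(c("estimated.pop", "predicted.pop"), each = 3),
  pop       = c(1, 2, 3, 4, 5, 6),
  urban.pop = c(1, 2, 3, NA, NA, NA)
)

# Count non-missing values of each y variable per linetype level
non_missing <- sapply(
  temp_toy[c("pop", "urban.pop")],
  function(col) tapply(!is.na(col), temp_toy$predicted.estimated.pop, sum)
)

# Rows are the linetype levels, columns are the y variables
print(non_missing)
```

A zero in the urban.pop column of the real data would explain why that plot shows a single linetype while the pop plot shows both.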