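I want to reshape my data, but I can't figure out how, because it isn't exactly a wide-to-long conversion.
Here is a sample of my data: population broken down by age and sex.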
data <- structure(list(`2010 - Both Sexes - 0` = 163753, `2010 - Male - 0` = 83878, `2010 - Female - 0` = 79875, `2011 - Both Sexes - 0` = 161923,
`2011 - Male - 0` = 83134, `2011 - Female - 0` = 78789, `2010 - Both Sexes - 1` = 163043,
`2010 - Male - 1` = 83174, `2010 - Female - 1` = 79869, `2011 - Both Sexes - 1` = 163342,
`2011 - Male - 1` = 83472, `2011 - Female - 1` = 79870), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame"))
The dataset I want looks like this:
age 2010 - Both Sexes 2010 - Male 2010 - Female 2011 - Both Sexes 2011 - Male 2011 - Female ...
0
1
...
Can anyone help? Thanks.
Answer 0 (score: 1)
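I don't know why you would want this format, but you can get it with tidyr like this:
gather all columns into one column of names and one column of values;
separate age from the rest of each name by splitting at the second " - ";
spread the population values back out into one column per year and sex.
library(tidyverse)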
data <- structure(list(`2010 - Both Sexes - 0` = 163753, `2010 - Male - 0` = 83878, `2010 - Female - 0` = 79875, `2011 - Both Sexes - 0` = 161923, `2011 - Male - 0` = 83134, `2011 - Female - 0` = 78789, `2010 - Both Sexes - 1` = 163043, `2010 - Male - 1` = 83174, `2010 - Female - 1` = 79869, `2011 - Both Sexes - 1` = 163342, `2011 - Male - 1` = 83472, `2011 - Female - 1` = 79870), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"))
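data %>%
  # stack every column into name/value pairs
  gather(year_sex_age, population) %>%
  # split the age off at the last " - " (the one followed only by digits)
  separate(year_sex_age, c("year_sex", "age"), sep = " - (?=\\d+$)") %>%
  # one column per year/sex combination, one row per age
  spread(year_sex, population)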
#> # A tibble: 2 x 7
#> age `2010 - Both Se~ `2010 - Female` `2010 - Male` `2011 - Both Se~
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 0 163753 79875 83878 161923
#> 2 1 163043 79869 83174 163342
#> # ... with 2 more variables: `2011 - Female` <dbl>, `2011 - Male` <dbl>
Created on 2018-08-01 by the reprex package (v0.2.0).
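For readers on a newer tidyr, the same reshape can probably be written with pivot_longer() and pivot_wider(), which supersede gather() and spread(). This is only a sketch, assuming tidyr 1.0.0 or later; the `names_pattern` regex and the `result` name are my own choices, not part of the original answer. It relies on every column name ending in " - " followed by the age in digits.

```r
library(tidyr)
library(dplyr)

# Sample data from the question, rebuilt as a one-row tibble
data <- tibble::tibble(
  `2010 - Both Sexes - 0` = 163753, `2010 - Male - 0` = 83878,
  `2010 - Female - 0` = 79875, `2011 - Both Sexes - 0` = 161923,
  `2011 - Male - 0` = 83134, `2011 - Female - 0` = 78789,
  `2010 - Both Sexes - 1` = 163043, `2010 - Male - 1` = 83174,
  `2010 - Female - 1` = 79869, `2011 - Both Sexes - 1` = 163342,
  `2011 - Male - 1` = 83472, `2011 - Female - 1` = 79870
)

result <- data %>%
  # Split each column name into "year - sex" and the trailing age
  # (assumption: names always end in " - <digits>")
  pivot_longer(
    everything(),
    names_to = c("year_sex", "age"),
    names_pattern = "^(.*) - (\\d+)$",
    values_to = "population"
  ) %>%
  # One column per year/sex combination, one row per age
  pivot_wider(names_from = year_sex, values_from = population)

result
```

Unlike spread(), pivot_wider() keeps the new columns in order of first appearance rather than sorting them alphabetically, so the columns should come out in the order shown in the question. As in the original answer, age comes back as a character column.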