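I want to restructure my data, which looks like the table on the left, into the table on the right (one row per country and year, with the regions spread across columns).
The sample data in the left table can be created as follows.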
data <- data.frame(country = c('US', 'US', 'US', 'US', 'US', 'UO', 'UO', 'UO', 'UO', 'UO'),
                   year = c(2015, 2015, 2016, 2016, 2016, 2015, 2015, 2015, 2016, 2016),
                   region = c('NY', 'CA', 'MI', 'MA', 'IL', 'GH', 'FD', 'AH', 'PO', 'LQ'))
Thanks for any suggestions.
Answer 0 (score: 1)
This can be solved with the dplyr, tidyr, and purrr packages:
# tidyverse attaches several packages, including tibble, dplyr, tidyr, purrr and ggplot2
library(tidyverse)
data %>%
  # the first two columns are the grouping variables
  group_by(year, country) %>%
  # for each group, collect the remaining columns into a list-column named `data`
  nest() %>%
  # map() is from purrr and applies a function to each element of the list-column;
  # t() transposes each nested data frame into a one-row matrix, and
  # as_tibble() (the current name for as.tibble()) turns it into a tibble
  # whose columns are explicitly named V1, V2, ...
  mutate(data = map(data, ~ as_tibble(
    t(.x),
    .name_repair = function(nms) paste0("V", seq_along(nms))
  ))) %>%
  # unnest the list-column; groups with fewer regions are padded with NA
  unnest(cols = data) %>%
  # rename the columns
  rename(region1 = V1, region2 = V2, region3 = V3)
Result:
# A tibble: 4 x 5
   year country region1 region2 region3
  <dbl> <fct>   <chr>   <chr>   <chr>
1  2015 US      NY      CA      NA
2  2016 US      MI      MA      IL
3  2015 UO      GH      FD      AH
4  2016 UO      PO      LQ      NA
If needed, replace NA with the empty string "".
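A minimal base-R sketch of that replacement is shown here. The data frame `wide` is a hypothetical stand-in for the reshaped result (the name is not from the original answer), and the sketch assumes the region columns are character vectors rather than factors.

```r
# Hypothetical stand-in for the reshaped result; `wide` is an illustrative name
wide <- data.frame(
  year    = c(2015, 2016, 2015, 2016),
  country = c("US", "US", "UO", "UO"),
  region1 = c("NY", "MI", "GH", "PO"),
  region2 = c("CA", "MA", "FD", "LQ"),
  region3 = c(NA, "IL", "AH", NA),
  stringsAsFactors = FALSE
)

# Pick out the region columns by name and replace NA with "" in each of them
region_cols <- grep("^region", names(wide), value = TRUE)
wide[region_cols] <- lapply(
  wide[region_cols],
  function(col) ifelse(is.na(col), "", col)
)
```

Only the region columns are touched, so `year` and `country` keep their original types.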
Alternatively, rename the columns like this:
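library(stringr)
# temp is assumed to hold the unnested result from before the rename() step,
# i.e. with columns year, country, V1, V2, V3
colnames(temp) <- str_replace(colnames(temp), pattern = fixed("V"), replacement = "region")
colnames(temp)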
#result
[1] "year" "country" "region1" "region2" "region3"
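As an alternative sketch, not part of the original answer, the same reshape can probably be written without the transpose step by numbering the regions within each group and then calling `tidyr::pivot_wider()`. This assumes tidyr 1.0.0 or later; the helper column `slot` and the result name `wide` are illustrative names only.

```r
library(dplyr)
library(tidyr)

data <- data.frame(
  country = c("US", "US", "US", "US", "US", "UO", "UO", "UO", "UO", "UO"),
  year    = c(2015, 2015, 2016, 2016, 2016, 2015, 2015, 2015, 2016, 2016),
  region  = c("NY", "CA", "MI", "MA", "IL", "GH", "FD", "AH", "PO", "LQ"),
  stringsAsFactors = FALSE
)

wide <- data %>%
  group_by(year, country) %>%
  # number the regions within each (year, country) group: region1, region2, ...
  mutate(slot = paste0("region", row_number())) %>%
  ungroup() %>%
  # spread the numbered regions into columns; missing combinations become NA
  pivot_wider(names_from = slot, values_from = region)
```

With this approach the column names are generated from the data, so no separate rename step is needed when a group has more than three regions.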