Reshape wide to long data with multiple rows

Time: 2016-02-14 17:28:22

Tags: r reshape

I have a dataset, dfs, that I would like to reshape:

dfs
#      country.name                     indicator.name         x1990         x1991         x1992
# 507       andorra GDP at market prices (current US$)  1.028989e+09  1.106891e+09  1.209993e+09
# 510       andorra              GDP growth (annual %)  3.781393e+00  2.546001e+00  9.292154e-01
# 1347      albania GDP at market prices (current US$)  2.101625e+09  1.139167e+09  7.094526e+08
# 1350      albania              GDP growth (annual %) -9.575640e+00 -2.958900e+01 -7.200000e+00
# 3587      austria GDP at market prices (current US$)  1.660624e+11  1.733755e+11  1.946082e+11

I would like the indicator names to become columns, with time in a single column identifying the year, like this:

#   country time   gdp_market  gdp_growth
# 1 andorra 1990   1028989394   3.7813935
# 2 andorra 1991   1106891025   2.5460006
# 3 andorra 1992   1209992650   0.9292154
# 4 albania 1990   2101624963  -9.5756402
# 5 albania 1991   1139166646 -29.5889977
# 6 albania 1992    709452584  -7.2000000
# 7 austria 1990 166062376740          NA
# 8 austria 1991 173375508073          NA
# 9 austria 1992 194608183696          NA

I can reshape the data to long format, but I can't split the values into two indicator columns:

library(reshape2)
melt.dfs <- melt(dfs, id=1:2)

I could do a split and cbind, but I'd prefer to do it with reshape. Thanks.

dfs = structure(list(country.name = c("andorra", "andorra", "albania", 
"albania", "austria"), indicator.name = c("GDP at market prices (current US$)", 
"GDP growth (annual %)", "GDP at market prices (current US$)", 
"GDP growth (annual %)", "GDP at market prices (current US$)"
), x1990 = c(1028989393.70295, 3.78139347786568, 2101624962.5, 
-9.57564018741695, 166062376739.683), x1991 = c(1106891024.78653, 
2.54600064090229, 1139166645.83333, -29.5889976817695, 173375508073.07
), x1992 = c(1209992649.56688, 0.929215382801402, 709452583.880319, 
-7.19999998650893, 194608183696.469)), .Names = c("country.name", 
"indicator.name", "x1990", "x1991", "x1992"), row.names = c(507L, 
510L, 1347L, 1350L, 3587L), class = "data.frame")

1 Answer:

Answer 0: (score: 1)

We can use

library(dplyr)
library(tidyr)
gather(dfs, time, Val, x1990:x1992) %>% 
       spread(indicator.name, Val)
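
In recent tidyr releases, gather and spread are superseded by pivot_longer and pivot_wider. The following is a minimal sketch of the same reshape with those functions, assuming tidyr 1.0.0 or later is installed. The intermediate column name val, the result name res, and the renamed columns gdp_market and gdp_growth (taken from the desired output in the question) are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as the dput() in the question, rebuilt so the block is self-contained
dfs <- data.frame(
  country.name = c("andorra", "andorra", "albania", "albania", "austria"),
  indicator.name = c("GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)"),
  x1990 = c(1028989393.70295, 3.78139347786568, 2101624962.5,
            -9.57564018741695, 166062376739.683),
  x1991 = c(1106891024.78653, 2.54600064090229, 1139166645.83333,
            -29.5889976817695, 173375508073.07),
  x1992 = c(1209992649.56688, 0.929215382801402, 709452583.880319,
            -7.19999998650893, 194608183696.469),
  stringsAsFactors = FALSE
)

res <- dfs %>%
  # Stack the year columns into (time, val) pairs
  pivot_longer(x1990:x1992, names_to = "time", values_to = "val") %>%
  # Drop the leading "x" so that time becomes an integer year
  mutate(time = as.integer(sub("^x", "", time))) %>%
  # One column per indicator; missing combinations are filled with NA
  pivot_wider(names_from = indicator.name, values_from = val) %>%
  # Short names as in the desired output (illustrative choice)
  rename(country = country.name,
         gdp_market = `GDP at market prices (current US$)`,
         gdp_growth = `GDP growth (annual %)`)

print(res)
```

Austria has no "GDP growth (annual %)" row in dfs, so its gdp_growth values come out as NA.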

Edit: based on a comment by @docendo discimus

Or use recast

library(reshape2)
recast(dfs, measure = 3:5, ...~indicator.name, value.var='value')
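
recast is documented in reshape2 as melt and dcast in a single step. Since the question already gets as far as melt, it may help to see the two steps written out. This is a sketch only: the names long and wide, and the choice to call the melted column time, are assumptions made for illustration.

```r
library(reshape2)

# Same data as the dput() in the question, rebuilt so the block is self-contained
dfs <- data.frame(
  country.name = c("andorra", "andorra", "albania", "albania", "austria"),
  indicator.name = c("GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)", "GDP growth (annual %)",
                     "GDP at market prices (current US$)"),
  x1990 = c(1028989393.70295, 3.78139347786568, 2101624962.5,
            -9.57564018741695, 166062376739.683),
  x1991 = c(1106891024.78653, 2.54600064090229, 1139166645.83333,
            -29.5889976817695, 173375508073.07),
  x1992 = c(1209992649.56688, 0.929215382801402, 709452583.880319,
            -7.19999998650893, 194608183696.469),
  stringsAsFactors = FALSE
)

# Step 1: wide to long, keeping country and indicator as id variables
long <- melt(dfs, id.vars = c("country.name", "indicator.name"),
             variable.name = "time", value.name = "value")

# Step 2: spread the indicators back out, one row per country and year
wide <- dcast(long, country.name + time ~ indicator.name, value.var = "value")

print(wide)
```

dcast sorts the rows by the id variables, so the countries appear in alphabetical order rather than in the order of the original data.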