How to combine columns in R

Time: 2016-05-17 16:29:25

Tags: r

I have a data frame in R as follows:

D = data.frame(countrycode = c(2, 2, 2, 3, 3, 3), 
           year = c(1980, 1991, 2013, 1980, 1991, 2013), 
           hello = c("A", "B", "C", "D", "E", "F"), 
           world = c("Z", "Y", "X", "NA", "Q", "NA"), 
           foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))

I would like to merge the hello, world, and foo columns into a single column, indexed by countrycode and year, like this:

output<-data.frame(countrycode=c(2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3),
    year=c(1980,1980,1980,1991,1991,1991,2013,2013,2013,1980,1980,1980,1991,1991,1991,2013,2013,2013),
    Combined=c("A","Z","Yes","B","Y","No","C","X","NA","D","NA","NA","E","Q","Yes","F","NA","NA"))

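I have tried cbind from base R and gather from the tidyr package, but neither seems to work.

3 Answers:

Answer 0: (score: 6)

I think you are looking for the reshape2 package. Try the following code: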

library(reshape2)

output<-melt(D,id.vars=c("countrycode","year"))
output<-output[order(output$countrycode,output$year),]

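This reproduces the rows of your example; note that the result also keeps a variable column and names the combined column value. Two functions are very useful here: melt and its inverse, dcast.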
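To illustrate how melt and dcast pair up, here is a minimal sketch of the round trip. It assumes the D from the question; stringsAsFactors = FALSE and the names long and wide are additions made for illustration and are not part of the original answer.

```r
library(reshape2)

# Same data as in the question; stringsAsFactors = FALSE is an assumption
# added here so the value column is plain character in every R version.
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"),
                stringsAsFactors = FALSE)

# Wide -> long: one row per (countrycode, year, variable) combination
long <- melt(D, id.vars = c("countrycode", "year"))

# Long -> wide: spread the 'variable' column back out into separate columns
wide <- dcast(long, countrycode + year ~ variable, value.var = "value")
```

Here wide should have the same five columns as D, which is what makes dcast the inverse of melt for this data.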

Answer 1: (score: 2)

A one-liner with reshape2 and dplyr:

library(reshape2)
library(dplyr)
converted = melt(D,
  measure.vars=c("hello","world","foo"),
  value.name="Combined") %>%
    arrange(countrycode, year) %>% select(-variable)

> converted
   countrycode year Combined
1            2 1980        A
2            2 1980        Z
3            2 1980      Yes
4            2 1991        B
5            2 1991        Y
6            2 1991       No

and so on. This also ends up with the same columns and column names as the sample output.

Answer 2: (score: 1)

With tidyr and dplyr, this looks like

library(dplyr)
library(tidyr)

D %>% gather(var, Combined, hello:foo) %>% arrange(countrycode, year)
#    countrycode year   var Combined
# 1            2 1980 hello        A
# 2            2 1980 world        Z
# 3            2 1980   foo      Yes
# 4            2 1991 hello        B
# 5            2 1991 world        Y
# 6            2 1991   foo       No
# .            .  ...   ...      ...

I left the key column in because data is lost without it, but if you really don't want it, tack on %>% select(-var)
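As a side note, tidyr 1.0.0 and later mark gather as superseded in favour of pivot_longer. A rough equivalent is sketched below, assuming a recent tidyr is installed; the name converted and stringsAsFactors = FALSE are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as in the question; stringsAsFactors = FALSE is an assumption
# added here so the three columns are character rather than factor.
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"),
                stringsAsFactors = FALSE)

# Stack hello, world and foo into one column, then drop the key column
converted <- D %>%
  pivot_longer(cols = hello:foo, names_to = "var", values_to = "Combined") %>%
  select(-var)
```

pivot_longer keeps the values from each original row together, so with D already sorted the rows come out grouped by countrycode and year without a separate arrange step.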