I have a data frame in R as follows:
D = data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
               year = c(1980, 1991, 2013, 1980, 1991, 2013),
               hello = c("A", "B", "C", "D", "E", "F"),
               world = c("Z", "Y", "X", "NA", "Q", "NA"),
               foo = c("Yes", "No", "NA", "NA", "Yes", "NA"))
I would like to combine the hello, world, and foo columns into a single column, indexed by countrycode and year, like this:
output<-data.frame(countrycode=c(2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3),
year=c(1980,1980,1980,1991,1991,1991,2013,2013,2013,1980,1980,1980,1991,1991,1991,2013,2013,2013),
Combined=c("A","Z","Yes","B","Y","No","C","X","NA","D","NA","NA","E","Q","Yes","F","NA","NA"))
I have tried cbind from base R and gather from the tidyr package, but neither seems to work.
Answer 0: (score: 6)
I think you are looking for the reshape2 package. Try the following code:
library(reshape2)
output<-melt(D,id.vars=c("countrycode","year"))
output<-output[order(output$countrycode,output$year),]
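It reproduces your example, with an extra variable column that records the source column and with the combined values in a column named value. Two functions are very useful here: melt and its inverse, dcast.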
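To illustrate how melt and dcast undo each other, here is a minimal sketch of a round trip on the data from the question. It assumes reshape2 is installed; the names long and wide are only illustrative, and stringsAsFactors = FALSE is added so the text columns stay character regardless of R version.

```r
library(reshape2)

# Data from the question; keep the text columns as character, not factor
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"),
                stringsAsFactors = FALSE)

# Wide to long: one row per (countrycode, year, variable)
long <- melt(D, id.vars = c("countrycode", "year"))

# Long back to wide: one column per level of `variable`
wide <- dcast(long, countrycode + year ~ variable, value.var = "value")
```

Because every (countrycode, year, variable) combination occurs exactly once, dcast needs no aggregation function here.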
Answer 1: (score: 2)
A one-liner with reshape2 and dplyr:
library(reshape2)
library(dplyr)
converted = melt(D,
                 measure.vars=c("hello","world","foo"),
                 value.name="Combined") %>%
  arrange(countrycode, year) %>% select(-variable)
> converted
countrycode year Combined
1 2 1980 A
2 2 1980 Z
3 2 1980 Yes
4 2 1991 B
5 2 1991 Y
6 2 1991 No
and so on. This also ends up with the same columns and column names as the sample output.
Answer 2: (score: 1)
With tidyr and dplyr, this looks like
library(dplyr)
library(tidyr)
D %>% gather(var, Combined, hello:foo) %>% arrange(countrycode, year)
# countrycode year var Combined
# 1 2 1980 hello A
# 2 2 1980 world Z
# 3 2 1980 foo Yes
# 4 2 1991 hello B
# 5 2 1991 world Y
# 6 2 1991 foo No
# . . ... ... ...
I left the key column in because the data loses information without it, but if you really don't want it, tack on %>% select(-var).
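For readers on a newer tidyr, the same reshape can be sketched with pivot_longer, which the tidyr documentation recommends over the superseded gather. This assumes tidyr 1.0.0 or later; the name long is only illustrative, and stringsAsFactors = FALSE is added so the three columns share the character type.

```r
library(dplyr)
library(tidyr)

# Data from the question; keep the text columns as character, not factor
D <- data.frame(countrycode = c(2, 2, 2, 3, 3, 3),
                year = c(1980, 1991, 2013, 1980, 1991, 2013),
                hello = c("A", "B", "C", "D", "E", "F"),
                world = c("Z", "Y", "X", "NA", "Q", "NA"),
                foo = c("Yes", "No", "NA", "NA", "Yes", "NA"),
                stringsAsFactors = FALSE)

# Stack hello, world and foo into one column, keeping the source name in `var`
long <- D %>%
  pivot_longer(cols = hello:foo, names_to = "var", values_to = "Combined") %>%
  arrange(countrycode, year)
```

As with gather, appending %>% select(-var) drops the key column if it is not wanted.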