Combining a data frame based on its columns in R

Date: 2014-11-19 13:31:18

Tags: r dataframe transform rows multiple-columns

Hello, I have the following data frame:

CSS_WEEK_END_DATE   BV  BVG  BVH BVG1 BVG2 BVG3 BVG4 BVH1 BVH2 BVH3 BVH4 BVH5 BVG11 BVG12 BVG13 BVG14 BVG15 BVG16 BVG21 BVG22 BVG23 BVG24 BVG25 BVG31 BVG32 BVG34
1        2012-01-13 28.0 28.3 27.7 28.6 28.7 27.3 28.7 29.5 27.2 26.5 27.8 27.5  34.3  30.7  29.8  25.9  25.7  28.0  29.9  33.9  26.2  32.0  24.4  29.3  24.0  26.9
2        2012-01-20 28.8 29.5 28.4 31.8 29.2 28.2 28.9 30.0 27.8 27.4 28.0 28.7  37.9  34.3  33.3  30.7  27.1  31.6  28.6  32.6  29.3  32.8  24.5  31.5  24.0  27.8
3        2012-01-27 28.2 28.6 27.9 30.7 28.4 27.4 28.0 29.7 27.5 26.9 27.4 28.0  34.8  29.3  32.8  29.5  28.3  31.3  26.9  33.2  27.0  31.3  25.4  30.8  23.4  25.9
4        2012-02-03 28.1 28.2 28.1 30.6 27.6 27.0 27.8 30.5 27.5 25.9 27.5 28.9  37.9  29.1  31.7  30.0  26.8  31.9  26.4  32.3  26.6  31.3  23.5  30.5  21.7  26.3
5        2012-02-10 27.9 28.1 27.7 30.5 27.9 27.0 27.5 30.4 26.8 26.2 26.4 28.5  35.9  30.0  30.8  30.5  25.8  32.9  26.1  32.1  26.2  31.4  25.2  30.6  24.1  25.6
6        2012-02-17 26.4 26.8 26.0 30.6 26.8 24.0 26.5 29.1 25.6 24.3 24.8 26.4  36.1  31.5  29.6  30.0  25.5  33.1  25.2  32.6  24.8  30.3  23.1  26.2  20.3  22.5

I would like to transform this data set into:

CSS_WEEK_END_DATE   patch   Drive.Per.Task
13/01/2012              BV  28
20/01/2012              BV  28.8
27/01/2012              BV  28.2
03/02/2012              BV  28.1
10/02/2012              BV  27.9
13/01/2012              BVG 28.3
20/01/2012              BVG 29.5
27/01/2012              BVG 28.6
03/02/2012              BVG 28.2
10/02/2012              BVG 28.1
13/01/2012              BVH 27.7
20/01/2012              BVH 28.4
27/01/2012              BVH 27.9
03/02/2012              BVH 28.1
10/02/2012              BVH 27.7
...

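What is the best way to do this in R? At the moment I have to copy and paste in Excel, which is a painful task because I have many patches, as the example above shows.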

2 answers:

Answer 0 (score: 4)

If your data frame is named DF, you could try something like this:

library(reshape2)
melt(DF, id='CSS_WEEK_END_DATE', value.name='Drive.Per.Task', variable.name='patch')
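
To try the call without the full data set, here is a minimal sketch on a small subset of the question's data (first three rows, three patches). The object name `long` and the final date-formatting step are illustrative additions, not part of the original answer; the formatting step assumes CSS_WEEK_END_DATE is stored as a Date and that the dd/mm/yyyy layout from the desired output is wanted.

```r
library(reshape2)

# Subset of the question's data: three weeks, three patches
DF <- data.frame(
  CSS_WEEK_END_DATE = as.Date(c("2012-01-13", "2012-01-20", "2012-01-27")),
  BV  = c(28.0, 28.8, 28.2),
  BVG = c(28.3, 29.5, 28.6),
  BVH = c(27.7, 28.4, 27.9)
)

# Wide to long: one row per (date, patch) pair
long <- melt(DF, id.vars = "CSS_WEEK_END_DATE",
             variable.name = "patch", value.name = "Drive.Per.Task")

# Optional (assumption): show dates as dd/mm/yyyy, as in the desired output.
# Note that this turns the column into character strings.
long$CSS_WEEK_END_DATE <- format(long$CSS_WEEK_END_DATE, "%d/%m/%Y")

head(long)
```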

If you do not have the reshape2 package, you can install it like this:

install.packages('reshape2')

Let me know if this helps.

Answer 1 (score: 2)

Here is an alternative to the "reshape2" package (by the same package author):

library(dplyr)
library(tidyr)
new <- mydf %>% gather(patch, Drive.Per.Task, BV:BVG34)
head(new)
#   CSS_WEEK_END_DATE patch Drive.Per.Task
# 1        2012-01-13    BV           28.0
# 2        2012-01-20    BV           28.8
# 3        2012-01-27    BV           28.2
# 4        2012-02-03    BV           28.1
# 5        2012-02-10    BV           27.9
# 6        2012-02-17    BV           26.4
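
In more recent tidyr releases (1.0.0 and later), gather() is superseded by pivot_longer(). The following is a sketch of an equivalent call on a small subset of the question's data; the sample `mydf` is built here only for illustration. One difference worth noting: pivot_longer() returns rows grouped by date rather than by patch, so the row order differs from the gather() output above.

```r
library(tidyr)

# Subset of the question's data: three weeks, three patches
mydf <- data.frame(
  CSS_WEEK_END_DATE = as.Date(c("2012-01-13", "2012-01-20", "2012-01-27")),
  BV  = c(28.0, 28.8, 28.2),
  BVG = c(28.3, 29.5, 28.6),
  BVH = c(27.7, 28.4, 27.9)
)

# Every column except the date becomes a (patch, Drive.Per.Task) pair
new <- pivot_longer(mydf,
                    cols = -CSS_WEEK_END_DATE,
                    names_to = "patch",
                    values_to = "Drive.Per.Task")
head(new)
```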

In base R, you can try stack:

out <- cbind(CSS_WEEK_END_DATE = mydf[[1]], 
             setNames(stack(mydf[-1]), 
                      c("Drive.Per.Task", "patch")))
head(out)
#   CSS_WEEK_END_DATE Drive.Per.Task patch
# 1        2012-01-13           28.0    BV
# 2        2012-01-20           28.8    BV
# 3        2012-01-27           28.2    BV
# 4        2012-02-03           28.1    BV
# 5        2012-02-10           27.9    BV
# 6        2012-02-17           26.4    BV
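
The cbind() call above relies on the date column being recycled to the length of the stacked result. A variant that spells the repetition out with rep() and puts the columns in the order of the desired output might look like the sketch below; the sample `mydf` and the intermediate name `stacked` are illustrative, not from the original answer.

```r
# Subset of the question's data: three weeks, three patches
mydf <- data.frame(
  CSS_WEEK_END_DATE = as.Date(c("2012-01-13", "2012-01-20", "2012-01-27")),
  BV  = c(28.0, 28.8, 28.2),
  BVG = c(28.3, 29.5, 28.6),
  BVH = c(27.7, 28.4, 27.9)
)

# stack() returns two columns: `values` (the numbers) and `ind` (source column name)
stacked <- stack(mydf[-1])

out <- data.frame(
  # Repeat the dates once per stacked patch column
  CSS_WEEK_END_DATE = rep(mydf$CSS_WEEK_END_DATE, times = ncol(mydf) - 1),
  patch = stacked$ind,
  Drive.Per.Task = stacked$values
)
head(out)
```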