Reshaping tabular data into a different format in R

Time: 2018-01-15 01:11:56

Tags: r

I want to change the following data:

pos    BZ_SP  BZ_SP_m1  BZ_SP_m2    CL_SP  CL_SP_m1  CL_SP_m2
  1  -300000         2         3  2540544         1         2
  2        0         0         0  -118621         3         4

so that it looks like this:

CurveGroup  SpreadId  SpreadMonth1  SpreadMonth2  Position
BZ_SP              1             2             3   -300000
CL_SP              1             1             2   2540544
BZ_SP              2             0             0         0
CL_SP              2             3             4   -118621

1 answer:

Answer 0: (score: 1)

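Use gather to convert the input to long form, then separate variable into CurveGroup and suffix, and spread it back out to wide form. Finally, rename and rearrange the columns.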

library(dplyr)
library(tidyr)

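DF %>%
   # long form: one row per (pos, original column name)
   gather(variable, value, -pos) %>%
   # the first 5 characters are the curve group; the rest ("", "_m1", "_m2") is the suffix
   separate(variable, c("CurveGroup", "suffix"), sep = 5, fill = "right") %>%
   # back to wide form: one column per suffix
   spread(suffix, value) %>%
   # rename and reorder; V1 is the name the empty suffix received in the output shown here
   select(CurveGroup, SpreadId = "pos", SpreadMonth1 = "_m1", SpreadMonth2 = "_m2",
     Position = "V1")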

giving:

  CurveGroup SpreadId SpreadMonth1 SpreadMonth2 Position
1      BZ_SP        1            2            3  -300000
2      CL_SP        1            1            2  2540544
3      BZ_SP        2            0            0        0
4      CL_SP        2            3            4  -118621
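
For readers on tidyr 1.0.0 or later, where gather() and spread() are superseded by pivot_longer() and pivot_wider(), the same reshaping can be sketched roughly as follows. This is not part of the original answer: the temporary "_Position" suffix and the name result are assumptions introduced here for illustration, and rename_with() assumes dplyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Same input as in the question, rebuilt here so the sketch is self-contained.
DF <- data.frame(
  pos = 1:2,
  BZ_SP = c(-300000L, 0L), BZ_SP_m1 = c(2L, 0L), BZ_SP_m2 = c(3L, 0L),
  CL_SP = c(2540544L, -118621L), CL_SP_m1 = c(1L, 3L), CL_SP_m2 = c(2L, 4L)
)

result <- DF %>%
  # Give the bare position columns an explicit suffix
  # ("_Position" is a hypothetical name chosen for this sketch).
  rename_with(~ paste0(.x, "_Position"), .cols = matches("_SP$")) %>%
  # Split each column name into the curve group and the output column;
  # the special ".value" entry turns the second part into column names.
  pivot_longer(
    -pos,
    names_to = c("CurveGroup", ".value"),
    names_pattern = "^(.*_SP)_(.*)$"
  ) %>%
  # Rename and reorder to match the requested layout.
  select(CurveGroup, SpreadId = pos, SpreadMonth1 = m1, SpreadMonth2 = m2, Position)

print(result)
```

Giving every column a suffix first means a single pivot_longer() call can do the work of the gather, separate and spread steps, and it avoids depending on how an empty suffix is named after spreading.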

Note: the input DF in reproducible form is:

DF <- structure(list(pos = 1:2, BZ_SP = c(-300000L, 0L), BZ_SP_m1 = c(2L, 
  0L), BZ_SP_m2 = c(3L, 0L), CL_SP = c(2540544L, -118621L), CL_SP_m1 = c(1L, 
  3L), CL_SP_m2 = c(2L, 4L)), .Names = c("pos", "BZ_SP", "BZ_SP_m1", 
  "BZ_SP_m2", "CL_SP", "CL_SP_m1", "CL_SP_m2"), 
  class = "data.frame", row.names = c(NA, -2L))

Update: simplified.
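
If loading dplyr and tidyr is not an option, a minimal base R sketch of the same idea is to build one block of rows per curve group and stack the blocks. It is not from the original answer; the names groups, per_group and out are hypothetical, and the curve groups are listed by hand on the assumption that they are known in advance.

```r
# Same input as in the question, rebuilt here so the sketch is self-contained.
DF <- data.frame(
  pos = 1:2,
  BZ_SP = c(-300000L, 0L), BZ_SP_m1 = c(2L, 0L), BZ_SP_m2 = c(3L, 0L),
  CL_SP = c(2540544L, -118621L), CL_SP_m1 = c(1L, 3L), CL_SP_m2 = c(2L, 4L)
)

# Curve groups, listed by hand for this example (an assumption).
groups <- c("BZ_SP", "CL_SP")

# One data frame per curve group, each with the target column names.
per_group <- lapply(groups, function(g) {
  data.frame(
    CurveGroup   = g,
    SpreadId     = DF$pos,
    SpreadMonth1 = DF[[paste0(g, "_m1")]],
    SpreadMonth2 = DF[[paste0(g, "_m2")]],
    Position     = DF[[g]]
  )
})

# Stack the blocks, then order by SpreadId so the groups interleave
# as in the requested output.
out <- do.call(rbind, per_group)
out <- out[order(out$SpreadId), ]
rownames(out) <- NULL

print(out)
```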