Columns to rows while keeping common column names

Time: 2016-08-09 15:35:34

Tags: r dplyr

I have a data frame that looks like this:

data.frame(group1_a_mu = 10, group1_b_sd = 4, group1_c_xx = 5, group2_a_mu=1, group2_b_sd=2, group2_c_xx = 14, stringsAsFactors = FALSE)

  group1_a_mu group1_b_sd group1_c_xx group2_a_mu group2_b_sd group2_c_xx
1          10           4           5           1           2          14

I would like to reshape it into:

       mu sd xx
group1 10  4  5
group2  1  2 14

How can I do this?

1 answer:

Answer 0 (score: 1)

You could try the following (based on the data in the original post):

library(dplyr)
library(tidyr)
data.frame(group1_a = 10, group1_b = 4, group1_c = 5, group2_a=1, group2_b=2, group2_c = 14, stringsAsFactors = FALSE) %>%
    gather(key, val) %>%
    separate(key, c('group_name', 'subgroup_name'), sep = '_') %>%
    spread(subgroup_name, val)

##   group_name  a b  c
## 1     group1 10 4  5
## 2     group2  1 2 14
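
To make the pipeline easier to follow, here is a small sketch that stops before spread and prints the intermediate long format. The names df and long are only illustrative and do not come from the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as above, stored in a variable (name is illustrative)
df <- data.frame(group1_a = 10, group1_b = 4, group1_c = 5,
                 group2_a = 1, group2_b = 2, group2_c = 14,
                 stringsAsFactors = FALSE)

long <- df %>%
    gather(key, val) %>%                                           # one row per original column
    separate(key, c('group_name', 'subgroup_name'), sep = '_')     # split 'group1_a' into two columns

long
##   group_name subgroup_name val
## 1     group1             a  10
## 2     group1             b   4
## 3     group1             c   5
## 4     group2             a   1
## 5     group2             b   2
## 6     group2             c  14
```

spread(subgroup_name, val) then turns each distinct subgroup_name into its own column, giving one row per group_name.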

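For the case with two _ characters (the updated post), the approach below temporarily replaces the first _ in each name with |, so that separate splits only on the second _, and then changes it back afterwards. An alternative is to use a look-ahead or look-behind assertion in the separate regular expression (sep).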

data.frame(group1_a_mu = 10, group1_b_sd = 4, group1_c_xx = 5, group2_a_mu=1, group2_b_sd=2, group2_c_xx = 14, stringsAsFactors = FALSE) %>%
    gather(key, val) %>%
    mutate(key = sub('_', '|', key)) %>%              ## Temporary change of '_' to '|'
    separate(key, c('group_name', 'subgroup_name'), sep = '_') %>%
    spread(subgroup_name, val) %>%
    mutate(group_name = sub('[|]', '_', group_name))  ## Change back to '_'

##   group_name mu sd xx
## 1   group1_a 10 NA NA
## 2   group1_b NA  4 NA
## 3   group1_c NA NA  5
## 4   group2_a  1 NA NA
## 5   group2_b NA  2 NA
## 6   group2_c NA NA 14
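
In the output above the letter stays part of group_name, so every original column ends up on its own row. If the goal is one row per group, as in the table in the question, one possible variation (not part of the original answer) is to split each name into three pieces and drop the middle one. The column names letter and stat and the variables df and res are illustrative assumptions.

```r
library(dplyr)
library(tidyr)

df <- data.frame(group1_a_mu = 10, group1_b_sd = 4, group1_c_xx = 5,
                 group2_a_mu = 1, group2_b_sd = 2, group2_c_xx = 14,
                 stringsAsFactors = FALSE)

res <- df %>%
    gather(key, val) %>%
    # Split on every '_': group / letter / statistic
    separate(key, c('group_name', 'letter', 'stat'), sep = '_') %>%
    select(-letter) %>%          # the letter is not needed in the result
    spread(stat, val)            # one column per statistic

res
##   group_name mu sd xx
## 1     group1 10  4  5
## 2     group2  1  2 14
```

This assumes every column name has exactly three parts separated by _ and that each group has at most one value per statistic.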

Using a positive look-behind assertion gives the same result.

data.frame(group1_a_mu = 10, group1_b_sd = 4, group1_c_xx = 5, group2_a_mu=1, group2_b_sd=2, group2_c_xx = 14, stringsAsFactors = FALSE) %>%
    gather(key, val) %>%
    separate(key, c('group_name', 'subgroup_name'), sep = '(?<=[a-z])_') %>%
    spread(subgroup_name, val)
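
To see where such an assertion matches, the pattern can be tried on a single column name with base R's strsplit (perl = TRUE enables look-around). This is only an illustrative sketch: the look-ahead pattern is an assumption, one of several that would work, and does not come from the original answer.

```r
key <- 'group1_a_mu'

# Look-behind: split on a '_' that is preceded by a lowercase letter.
# The first '_' follows the digit '1', so only the second '_' matches.
behind <- strsplit(key, '(?<=[a-z])_', perl = TRUE)[[1]]

# Look-ahead (assumed pattern): split on a '_' that is followed only by
# non-underscore characters up to the end of the string, i.e. the last '_'.
ahead <- strsplit(key, '_(?=[^_]+$)', perl = TRUE)[[1]]

behind
## [1] "group1_a" "mu"
ahead
## [1] "group1_a" "mu"
```

Both patterns split the name into 'group1_a' and 'mu', which is the same split the temporary | replacement achieves.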