Changing a data structure in R using a delimiter

Time: 2018-10-26 16:21:02

Tags: r

My input dataset is

df1 = data.frame(Var_A = c('A,B','C'),Var_B = c('1,2','2'))

The desired output is

df2 = data.frame(Var_A = c('A','A','B','B','C'),Var_B = c('1','2','1', '2','2'))

Please help.

2 answers:

Answer 0 (score: 1)

We can use cSplit from splitstackshape, splitting each column into long format in turn:

library(splitstackshape)
library(dplyr)
cSplit(df1, "Var B", ",", "long") %>%
    cSplit(., "Var A", ",", "long")

Or with separate_rows from tidyr:

library(tidyr)
library(dplyr)  # provides %>% and arrange()
separate_rows(df1, "Var B", convert = TRUE) %>%
      separate_rows("Var A") %>%
      arrange(`Var A`)
#   Var A Var B
#1     A     1
#2     A     2
#3     B     1
#4     B     2
#5     C     2

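Data (note that the column names here contain a space, `Var A` and `Var B`, unlike Var_A and Var_B in the question)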

df1 <- structure(list(`Var A` = c("A,B", "C"), `Var B` = c("1,2", "2"
 )), class = "data.frame", row.names = c(NA, -2L))
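
The answer's data uses column names with a space, while the question's df1 uses Var_A and Var_B. As a minimal sketch, assuming tidyr is installed, the same separate_rows idea applied to the question's own column names might look like the following. Passing sep = "," explicitly is a choice made here, not part of the original answer (the default separator is a regular expression that matches any run of non-alphanumeric characters), and base order() stands in for arrange() so that dplyr is not required.

```r
library(tidyr)

# Input from the question; stringsAsFactors = FALSE keeps the columns as character
df1 <- data.frame(Var_A = c("A,B", "C"), Var_B = c("1,2", "2"),
                  stringsAsFactors = FALSE)

# Split each column on commas, one column at a time; every split adds rows
out <- separate_rows(df1, Var_B, sep = ",")
out <- separate_rows(out, Var_A, sep = ",")

# Sort by Var_A, then Var_B, to match the row order asked for in the question
out <- out[order(out$Var_A, out$Var_B), ]
out
```

Splitting one column at a time is what produces every Var_A/Var_B combination within each original row.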

Answer 1 (score: 0)

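I came up with a base R approach: split each row on commas, build every combination with expand.grid, then bind the pieces together.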

cut <- apply(df1, 1, function(x){
  expand.grid(strsplit(x, ","))
})

cut

# [[1]]
#   Var_A Var_B
# 1     A     1
# 2     B     1
# 3     A     2
# 4     B     2
#
# [[2]]
#   Var_A Var_B
# 1     C     2

Reduce(rbind, cut)

#   Var_A Var_B
# 1     A     1
# 2     B     1
# 3     A     2
# 4     B     2
# 5     C     2
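
The result above has the right rows, but in the order A, B, A, B, C rather than the A, A, B, B, C order shown in the question. A minimal base R sketch of one way to get that order is below. It is a variation on this answer, not the answerer's exact code: the names pieces and res are made up for illustration, lapply over row indices replaces apply, and do.call(rbind, ...) replaces Reduce(rbind, ...). It also avoids naming a variable cut, which would mask base R's cut() function.

```r
# Input from the question; stringsAsFactors = FALSE keeps the columns as character
df1 <- data.frame(Var_A = c("A,B", "C"), Var_B = c("1,2", "2"),
                  stringsAsFactors = FALSE)

# One small data frame per input row, holding every Var_A/Var_B combination
pieces <- lapply(seq_len(nrow(df1)), function(i) {
  expand.grid(Var_A = strsplit(df1$Var_A[i], ",")[[1]],
              Var_B = strsplit(df1$Var_B[i], ",")[[1]],
              stringsAsFactors = FALSE)
})

# Stack the pieces, then sort by Var_A and Var_B to match the question's order
res <- do.call(rbind, pieces)
res <- res[order(res$Var_A, res$Var_B), ]
rownames(res) <- NULL
res
```

Sorting after binding keeps the splitting logic unchanged. Var_B stays a character column here, as in the question's df2.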