How to custom-flatten a data frame?

Time: 2017-08-23 21:41:30

Tags: r

My data frame looks like this:

df <- data.frame(x=c('a,b,c','d,e','f'),y=c(1,2,3))
df

> df
      x y
1 a,b,c 1
2   d,e 2
3     f 3

I can get a flattened df$x like this:

unique(unlist(strsplit(as.character(df$x), ",")))
[1] "a" "b" "c" "d" "e" "f"

What is the best way to convert the input df into the following?

 x y
 a 1
 b 1 
 c 1
 d 2
 e 2
 f 3

Basically, flatten df$x and give each resulting element the y value of the row it came from.

3 answers:

Answer 0 (score: 1)

sapply(unlist(strsplit(as.character(df$x), ",")), function(ss)
    df$y[which(grepl(pattern = ss, x = df$x))])
#a b c d e f 
#1 1 1 2 2 3 
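
Note that grepl does pattern (substring) matching, so the lookup above can return more than one y for an element that is contained in another one (for example "a" and "ab"). A minimal sketch that avoids the lookup, assuming the same df as in the question; the names parts and res are introduced here only for illustration:

```r
df <- data.frame(x = c('a,b,c', 'd,e', 'f'), y = c(1, 2, 3))

# Split each x into its elements (as.character in case x is a factor)
parts <- strsplit(as.character(df$x), ",", fixed = TRUE)

# Repeat each y once per element of its row, then name the values by element
res <- setNames(rep(df$y, lengths(parts)), unlist(parts))
res
```

Because each y is repeated by position rather than found by matching, duplicated or overlapping elements in x do not affect the result.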

If you want a data frame:

do.call(rbind, lapply(1:NROW(df), function(i)
    setNames(data.frame(unlist(strsplit(as.character(df$x[i]), ",")), df$y[i]),
             names(df))))
#  x y
#1 a 1
#2 b 1
#3 c 1
#4 d 2
#5 e 2
#6 f 3

Answer 1 (score: 1)

If you are working with a data.frame, I suggest using tidyr:

df <- data.frame(x=c('a,b,c','d,e','f'),y=c(1,2,3),stringsAsFactors = FALSE)
library(tidyr)
df %>%
    transform(x= strsplit(x, ",")) %>%
    unnest(x)


  y x
1 1 a
2 1 b
3 1 c
4 2 d
5 2 e
6 3 f

Answer 2 (score: 1)

FWIW, you can also use rep to build the row indices, repeating each row once per element that its x value contains:

df <- data.frame(x=c('a,b,c','d,e','f'),y=c(1,2,3),stringsAsFactors = FALSE)
df[,1] <- strsplit(df[,1],",")
cbind(x=unlist(df[,1]),df[rep(1:nrow(df), lengths(df[,1])),-1,drop=FALSE])
#     x y
# 1   a 1
# 1.1 b 1
# 1.2 c 1
# 2   d 2
# 2.1 e 2
# 3   f 3
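
The same rep idea can be wrapped in a small reusable function. This is only a sketch under stated assumptions: flatten_column is a hypothetical helper written for this example (it is not part of base R or any package), and it assumes the separator is a literal string.

```r
# Hypothetical helper (not from any package): split column `col` of `data`
# on the literal separator `sep` and repeat the other columns per element.
flatten_column <- function(data, col, sep = ",") {
  # as.character guards against the column being a factor
  parts <- strsplit(as.character(data[[col]]), sep, fixed = TRUE)
  # Repeat row i once for every element found in row i
  out <- data[rep(seq_len(nrow(data)), lengths(parts)), , drop = FALSE]
  out[[col]] <- unlist(parts)
  rownames(out) <- NULL  # replace 1, 1.1, 1.2, ... with 1..n
  out
}

df <- data.frame(x = c('a,b,c', 'd,e', 'f'), y = c(1, 2, 3),
                 stringsAsFactors = FALSE)
flat <- flatten_column(df, "x")
flat
```

Unlike the cbind version, this keeps the original column order and resets the row names. Rows whose value is an empty string would be dropped, since strsplit returns no elements for them.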