Regex: extract the values between commas

Time: 2016-07-11 20:45:07

Tags: regex r gsub

data<-data.frame(x=c("a,b","c","a,b","d,e,f,g"))
        x
1     a,b
2       c
3     a,b
4 d,e,f,g

I would like to split the comma-separated values in column x and write every unique value into a column y. How can I do this? Thank you! Column y is expected to look like:

  y
1 a
2 b
3 c
4 d
5 e
6 f
7 g

2 answers:

Answer 0 (score: 1)

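# levels() only works on a factor; since R 4.0.0 strings are no longer converted by default
d <- data.frame(x = c("a,b","c","a,b","d,e,f,g"), stringsAsFactors = TRUE)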

> levels(d$x)
[1] "a,b"     "c"       "d,e,f,g"

> e <- as.character(levels(d$x))
> e
[1] "a,b"     "c"       "d,e,f,g"

> f <- strsplit(e,",")
> f
[[1]]
[1] "a" "b"

[[2]]
[1] "c"

[[3]]
[1] "d" "e" "f" "g"

> unique(unlist(f))  # unique() guards against values repeated across different rows
[1] "a" "b" "c" "d" "e" "f" "g"
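
The steps in this answer can also be collapsed into a single expression that does not depend on x being a factor. The following is a minimal sketch, assuming the sample data from the question; the names `values` and `result` are illustrative and do not come from the original answer.

```r
# Sample data from the question; stringsAsFactors is set explicitly so the
# result does not depend on the default of the installed R version.
data <- data.frame(x = c("a,b", "c", "a,b", "d,e,f,g"),
                   stringsAsFactors = FALSE)

# Split every row on commas, flatten the resulting list into one vector,
# and keep only the first occurrence of each value.
values <- unique(unlist(strsplit(as.character(data$x), ",", fixed = TRUE)))

# y has more rows than x, so it goes into its own data frame.
result <- data.frame(y = values, stringsAsFactors = FALSE)
print(result)
```

Because y ends up with seven rows while the original data has four, it cannot simply be attached as a new column of `data`; a separate data frame is probably the simplest container.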

Answer 1 (score: 1)

A tidyr solution:

library(tidyr)
data %>% unnest(x=strsplit(as.character(x),",")) %>% unique()

Or, more concisely (thanks to @alistaire):

data %>% separate_rows(x) %>% unique()
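
For readers on a recent tidyr, where the interface of `unnest()` differs from the one available in 2016, the same idea can be written with an explicit list column. This is a hedged sketch rather than the answerer's code: it assumes dplyr is installed alongside tidyr, and `result` and `result2` are illustrative names.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, kept as character strings.
data <- data.frame(x = c("a,b", "c", "a,b", "d,e,f,g"),
                   stringsAsFactors = FALSE)

# Variant 1: one row per comma-separated value, then drop duplicate rows
# and rename the column to y as the question asks.
result <- data %>%
  separate_rows(x, sep = ",") %>%
  distinct() %>%
  rename(y = x)

# Variant 2: build the list column explicitly, then expand it with unnest().
result2 <- data %>%
  mutate(x = strsplit(x, ",", fixed = TRUE)) %>%
  unnest(x) %>%
  distinct()

print(result)
```

`distinct()` is used here in place of `unique()`; on a one-column data frame both should keep the first occurrence of each value.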