Converting list output to create a data.frame in a dplyr pipeline

Time: 2017-01-14 16:15:15

Tags: r regex dplyr tidyr

I am using a regular expression to extract a pattern and build a data.frame with dplyr:

library(dplyr)
library(stringr)

Target <- c("@user1 lorem ipsum @user2", "@user3 lorem ipsum @user4")
Source <- c(" lorem ipsum", "dolores")
dataset <- data.frame(Source, Target, stringsAsFactors = FALSE)

# Extract every @mention; Target becomes a list column
dataset2 <- dataset %>%
  mutate(Target = str_extract_all(Target, "@\\w+"))

My result (data.frame):

lorem ipsum c("@user1", "@user2")
dolores     c("@user3", "@user4")
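
For context, the `c("@user1", "@user2")` cells appear because `str_extract_all()` returns a list with one character vector per input string, so `Target` ends up as a list column. A minimal sketch that inspects this, reusing the `Target` vector from above (the name `matches` is introduced here only for illustration):

```r
library(stringr)

Target <- c("@user1 lorem ipsum @user2", "@user3 lorem ipsum @user4")

# str_extract_all() returns a list: one character vector per input string
matches <- str_extract_all(Target, "@\\w+")

class(matches)    # the container is a list, not a character vector
lengths(matches)  # number of matches found in each input string
```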

The data.frame I want:

lorem ipsum  "@user1"
lorem ipsum  "@user2"
dolores      "@user3"
dolores      "@user4"

1 Answer:

Answer 0 (score: 1)

We can try `stack` with `setNames` from base R:

stack(setNames(str_extract_all(dataset$Target, "@\\w+"), dataset$Source))[2:1]
#          ind values    
#1  lorem ipsum @user1
#2  lorem ipsum @user2
#3      dolores @user3
#4      dolores @user4

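Or we can use `unnest` from tidyr:

library(dplyr)
library(stringr)
library(tidyr)
dataset %>%
      # Target becomes a list column of matches
      mutate(Target = str_extract_all(Target, "@\\w+")) %>%
      # Expand the list column into one row per match
      unnest(Target)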
#        Source Target
#1  lorem ipsum @user1
#2  lorem ipsum @user2
#3      dolores @user3
#4      dolores @user4
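
To show what the expansion step is doing, here is a minimal base R sketch that does not use stringr or tidyr. It is an alternative technique, not part of the answer above: `regmatches()` with `gregexpr()` stands in for `str_extract_all()`, and `rep()` with `lengths()` stands in for `unnest`. The names `matches` and `result` are introduced here only for illustration.

```r
Target <- c("@user1 lorem ipsum @user2", "@user3 lorem ipsum @user4")
Source <- c(" lorem ipsum", "dolores")

# Base R counterpart of str_extract_all(): one character vector per input
matches <- regmatches(Target, gregexpr("@\\w+", Target))

# Repeat each Source once per match, then flatten the matches into one vector
result <- data.frame(
  Source = rep(Source, lengths(matches)),
  Target = unlist(matches),
  stringsAsFactors = FALSE
)

print(result)
```

Because each `Source` value is repeated by the number of matches found in its row, this sketch also handles rows with differing numbers of mentions; a row with no match contributes no rows to `result`.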