I have searched for a specific answer to my question, but without success.
First of all, I have a data frame with 48 variables that looks like this:
> df
Text Screen_Name ...
1 a text where @Sam and @Su and @Jim are addressed Peter
2 a text where @Eric is addressed Margret
3 a text where @Sarah and @Adam are addressed John
Now I extract all strings that match the pattern "@\\S+" and store them in a new column:
df$Addressees <- str_extract_all(df$Text, "@\\S+")
This gives me:
... Screen_Name Addressees ...
1 Peter c("@Sam", "@Su", "@Jim")
2 Margret @Eric
3 John c("@Sarah", "@Adam")
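The new column is a list column: each cell holds a character vector whose length depends on how many mentions the text contains. A minimal sketch to check that shape, using base R's `regmatches()` and `gregexpr()` as a stand-in for `str_extract_all()` so it runs without extra packages (the toy data is copied from the example above and only has the two relevant columns):

```r
# Toy data copied from the question (assumption: only two of the 48 columns)
df <- data.frame(
  Text = c("a text where @Sam and @Su and @Jim are addressed",
           "a text where @Eric is addressed",
           "a text where @Sarah and @Adam are addressed"),
  Screen_Name = c("Peter", "Margret", "John"),
  stringsAsFactors = FALSE
)

# One character vector of matches per row, returned as a list
df$Addressees <- regmatches(df$Text, gregexpr("@\\S+", df$Text))

# Number of addressees found in each row
n_addressees <- lengths(df$Addressees)
print(n_addressees)
```

Because the cells have different lengths, the column cannot be flattened without also repeating `Screen_Name` the matching number of times.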
Now I want to create a new data frame with two columns, in which each addressee gets its own row and the corresponding value of the column "Screen_Name" is repeated:
> df
Screen_Name Addressees
1 Peter Sam
2 Peter Su
3 Peter Jim
4 Margret Eric
5 John Sarah
6 John Adam
I have tried solutions from similar questions, but none of them seem to work.
Thank you very much for your help!
Answer 0 (score: 3)
OK, here is a reproducible example:
# create df
ego <- c("peter","margaret","john")
friends <- list(c("sam","su","jim"),c("eric"),c("sarah","adam"))
df <- data.frame(ego, friends = I(friends), stringsAsFactors = FALSE)
# use repeat function to repeat rows
times <- sapply(df$friends,length)
df <- df[rep(seq_len(nrow(df)), times),]
# assign back unlisted friends
df$friends <- unlist(friends)
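The same repeat-and-unlist idea can be applied to the column names from the question. This is only a sketch under assumptions: the data frame is rebuilt by hand from the example output, and `long` is a hypothetical name for the result. The leading "@" is stripped with `sub()`.

```r
# Rebuild the relevant part of the question's data (assumed from its output)
df <- data.frame(Screen_Name = c("Peter", "Margret", "John"),
                 stringsAsFactors = FALSE)
df$Addressees <- list(c("@Sam", "@Su", "@Jim"), "@Eric", c("@Sarah", "@Adam"))

# How many addressees each row has
times <- lengths(df$Addressees)

# Repeat each Screen_Name once per addressee and flatten the list column,
# dropping the leading "@" from every name
long <- data.frame(
  Screen_Name = rep(df$Screen_Name, times),
  Addressees  = sub("^@", "", unlist(df$Addressees)),
  stringsAsFactors = FALSE
)
print(long)
```

`rep()` with a vector of counts and `unlist()` both walk the rows in the same order, which is what keeps each addressee aligned with the right sender.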
Answer 1 (score: 3)
You can also try data.table, using the df created by @raistlin:
library(data.table)
setDT(df)[, .(friends = unlist(friends)), by = "ego"]
ego friends
1: peter sam
2: peter su
3: peter jim
4: margaret eric
5: john sarah
6: john adam
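Now, with the additional context provided by the OP, the data.table solution can be simplified to solve the underlying problem in a single line.
To remove the leading @ in the Addressees column as requested by the OP, the regular expression needs to be modified to use a positive lookbehind.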
library(data.table)
# read data (to make it a reproducible example)
dt <- fread("Text; Screen_Name
a text where @Sam and @Su and @Jim are addressed; Peter
a text where @Eric is addressed; Margret
a text where @Sarah and @Adam are addressed; John")
# use str_extract_all with modified regex
dt[, .(Addressees = unlist(stringr::str_extract_all(Text, "(?<=@)\\S+"))),
by = .(Screen_Name)]
# Screen_Name Addressees
#1: Peter Sam
#2: Peter Su
#3: Peter Jim
#4: Margret Eric
#5: John Sarah
#6: John Adam
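To see what the lookbehind changes, here is a small base R comparison of the two patterns on a single string. It is a sketch, not part of the data.table solution: base R needs `perl = TRUE` for lookbehind support, and the variable names are made up for illustration.

```r
x <- "a text where @Sam and @Su and @Jim are addressed"

# Plain pattern: the "@" is part of the match
with_at <- regmatches(x, gregexpr("@\\S+", x))[[1]]

# Positive lookbehind: "@" must precede the match but is not included in it
without_at <- regmatches(x, gregexpr("(?<=@)\\S+", x, perl = TRUE))[[1]]

print(with_at)
print(without_at)
```

Note that `\\S+` takes everything up to the next whitespace, so a mention followed directly by punctuation (for example "@Sam,") would keep the comma. If that can occur in the real data, a narrower class such as `\\w+` might be a better fit, but that is an assumption about the data.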
Answer 2 (score: 0)
Does this help?
Input:
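Screen_Name <- c("Peter", "Margaret", "John")
# a list keeps each person's addressees together; nested c() calls would flatten them
Addressees <- list(c("@Sam", "@Su", "@Jim"), "@Eric", c("@Sarah", "@Adam"))
The tidyverse way:
library(tidyr)  # unnest(); also re-exports tibble() and %>%
# unnest() yields one row per addressee and repeats Screen_Name as needed
df <- tibble(Screen_Name, Addressees) %>%
  unnest(Addressees)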