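I have a data frame of words and their synonyms. In a different data frame, I have sentences. I want to search each sentence for the synonyms listed in the first data frame and, whenever one is found, replace it with the word it is a synonym of.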
dt = read.table(header = TRUE,
text ="Word Synonyms
Use 'employ, utilize, exhaust, spend, expend, consume, exercise'
Come 'advance, approach, arrive, near, reach'
Go 'depart, disappear, fade, move, proceed, recede, travel'
Run 'dash, escape, elope, flee, hasten, hurry, race, rush, speed, sprint'
Hurry 'rush, run, speed, race, hasten, urge, accelerate, bustle'
Hide 'conceal, cover, mask, cloak, camouflage, screen, shroud, veil'
", stringsAsFactors= F)
mydf = read.table(header = TRUE, stringsAsFactors = FALSE,
text ="sentence
'I can utilize this file'
'I can cover these things'
")
The desired output looks like this:
I can Use this file
I can Hide these things
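The above is just a sample. My real dataset has more than 10,000 sentences. I am currently using the code below, but it is too slow. Is there a more efficient way to do this?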
dt$Synonyms <- paste("\\b",gsub(", ","\\\\b|\\\\b",tolower(dt$Synonyms)),"\\b", sep = "")
# Loop through each row of 'dt' to replace Synonyms with word using sapply
mydf$sentence <- sapply(tolower(mydf$sentence), function(x){
for(row in 1:nrow(dt)){
x = gsub(dt$Synonyms[row],dt$Word[row], x)
}
x
})
mydf
Answer 0 (score: 0)
Try this:
Data:
dt = read.table(header = TRUE,
text ="Word Synonyms
Use 'employ, utilize, exhaust, spend, expend, consume, exercise'
Come 'advance, approach, arrive, near, reach'
Go 'depart, disappear, fade, move, proceed, recede, travel'
Run 'dash, escape, elope, flee, hasten, hurry, race, rush, speed, sprint'
Hurry 'rush, run, speed, race, hasten, urge, accelerate, bustle'
Hide 'conceal, cover, mask, cloak, camouflage, screen, shroud, veil'
", stringsAsFactors= F)
mydf = read.table(header = TRUE, stringsAsFactors = FALSE,
text ="sentence
'I can utilize this file'
'I can cover these things'
'I can advance and cover also rush things'
")
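Code:
library(magrittr)  # provides %>% and %<>%
# Turn each comma-separated synonym list into a regex alternation, e.g. "employ|utilize|..."
dt$Synonyms <- gsub(",\\s+", "|", dt$Synonyms)
# Replace every synonym found in one sentence with its word, one row of dt at a time
fun1 <- function(sentence){
  mapply(function(word, synon){sentence <<- gsub(synon, word, sentence)}, word = dt$Word, synon = dt$Synonyms)
  return(sentence)
}
mydf$sentence %>% lapply(fun1)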
Result:
[[1]]
[1] "I can Use this file"
[[2]]
[1] "I can Hide these things"
[[3]]
[1] "I can Come and Hide also Run things"
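Because `gsub` is vectorized over its input, the same replacement can also be run on the whole column of sentences at once, looping only over the few rows of the synonym table rather than over every sentence. The following is a minimal sketch of that idea, not a benchmarked solution. The names `syn`, `pats`, and `sentences` and the cut-down data are made up for illustration, and the sketch assumes you want the `\\b` word boundaries from the question so that, for example, `near` does not match inside `nearly`.

```r
# Illustrative data (hypothetical): a cut-down version of the synonym table.
syn <- data.frame(
  Word     = c("Use", "Come", "Hide"),
  Synonyms = c("employ, utilize, spend",
               "advance, near, reach",
               "conceal, cover, mask"),
  stringsAsFactors = FALSE
)
sentences <- c("I can utilize this file",
               "I can cover these things",
               "We are nearly there")

# One pattern per row of the table, e.g. "\\b(employ|utilize|spend)\\b".
pats <- paste0("\\b(", gsub(",\\s*", "|", syn$Synonyms), ")\\b")

# Loop over the rows of the synonym table only; each gsub() call
# processes every sentence in a single vectorized pass.
for (i in seq_len(nrow(syn))) {
  sentences <- gsub(pats[i], syn$Word[i], sentences, perl = TRUE)
}
sentences
```

With this layout the number of `gsub` calls depends on the size of the synonym table, not on the number of sentences.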
Please note:
If you want to keep the changes in mydf, replace %>% with %<>%.
mydf$sentence %<>% lapply(fun1)
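Some synonyms (such as `rush`) appear under more than one word, and replacing row by row means the result depends on row order. One possible alternative is a single pass over the text with one combined pattern and a named lookup vector, using base R's `gregexpr` and `regmatches<-`. This is only a sketch under stated assumptions: the names `syn`, `lookup`, `pattern`, and `sentences` are hypothetical, the data is a cut-down illustration, and keeping the first entry for a synonym listed twice is an arbitrary choice.

```r
# Illustrative data (hypothetical): two words that share the synonym "rush".
syn <- data.frame(
  Word     = c("Run", "Hurry"),
  Synonyms = c("dash, flee, rush", "rush, run, urge"),
  stringsAsFactors = FALSE
)

# Named lookup vector: names are synonyms, values are replacement words.
parts  <- strsplit(syn$Synonyms, ",\\s*")
lookup <- setNames(rep(syn$Word, times = lengths(parts)), unlist(parts))
# Assumption: a synonym listed under several words keeps its first entry.
lookup <- lookup[!duplicated(names(lookup))]

# One combined pattern covering every synonym, with word boundaries.
pattern <- paste0("\\b(", paste(names(lookup), collapse = "|"), ")\\b")

sentences <- c("I rush home", "They flee and run")

# Find all matches once, then swap each match for its lookup value.
m <- gregexpr(pattern, sentences, perl = TRUE)
regmatches(sentences, m) <- lapply(regmatches(sentences, m),
                                   function(hits) unname(lookup[hits]))
sentences
```

Since every match is replaced in the same pass, a word that has just been inserted is never scanned again by a later pattern.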