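Friends, I am trying to search a list of files for specific keywords (given in txt). I am using a regular expression to detect and replace occurrences of the keywords in the files. Below are the comma-separated keywords I pass to the search.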
library(stringi)
txt <- "automatically got activated,may be we download,network services,food quality is excellent"
For example, "automatically got activated" should be found and replaced with "automatically_got_activated", "may be we download" should be replaced with "may_be_we_download", and so on.
for (i in 1:length(txt)) {
  start <- head(strsplit(txt, split = " ")[[i]], 1)  # finding the first word of the keyword
  n <- stri_stats_latex(txt[i])[4]                   # number of words in the keyword
  # best match for the keyword regex in the file
  o <- tolower(regmatches(text, regexpr(paste0(start, "(?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,", n - 1, "}"),
                                        text, ignore.case = TRUE)))
  p <- which(!is.na(pmatch(txt, o)))                 # exact match for the keywords
}
Answer 0 (score: 1)
I think this might be what you are looking for.
> txt <- "automatically got activated,may be we download,network services,food quality is excellent"
A made-up vector of sentences to search:
> searchList <- c('This is a sentence that automatically got activated',
'may be we download some music tonight',
'I work in network services',
'food quality is excellent every time I go',
'New service entrance',
'full quantity is excellent')
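The function that does the job:
replace.keyword <- function(text, toSearch)
{
  kw <- unlist(strsplit(text, ','))  # split the keywords passed in (the argument, not the global txt)
  gs <- gsub('\\s', '_', kw)         # underscore-joined replacement for each keyword
  sapply(seq_along(kw), function(i){
    # replace keyword i where it occurs; blank out sentences that do not contain it
    ul <- ifelse(grepl(kw[i], toSearch),
                 gsub(kw[i], gs[i], toSearch),
                 "")
    ul[nzchar(ul)]                   # keep only the sentences that matched
  })
}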
Result:
> replace.keyword(txt, searchList)
# [1] "This is a sentence that automatically_got_activated"
# [2] "may_be_we_download some music tonight"
# [3] "I work in network_services"
# [4] "food_quality_is_excellent every time I go"
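One caveat: the function above returns only the sentences that matched, and sapply falls back to a list when a keyword matches no sentence or more than one. If you would rather get back a vector of the same length as the input, with untouched sentences left in place, here is a minimal sketch in base R. The name replace_all_keywords is made up for illustration, and it assumes the keywords should be matched as literal, case-sensitive strings.

```r
# Hypothetical helper (illustration only): replaces every keyword in every
# sentence and returns a character vector of the same length as the input.
replace_all_keywords <- function(keywords, sentences) {
  # split the comma-separated keywords and drop stray surrounding whitespace
  kw <- trimws(unlist(strsplit(keywords, ",", fixed = TRUE)))
  # underscore-joined replacement for each keyword
  gs <- gsub(" ", "_", kw, fixed = TRUE)
  for (i in seq_along(kw)) {
    # fixed = TRUE treats the keyword as a literal string, not a regex
    sentences <- gsub(kw[i], gs[i], sentences, fixed = TRUE)
  }
  sentences
}

txt <- "automatically got activated,may be we download,network services,food quality is excellent"
searchList <- c('This is a sentence that automatically got activated',
                'may be we download some music tonight',
                'I work in network services',
                'food quality is excellent every time I go',
                'New service entrance',
                'full quantity is excellent')

out <- replace_all_keywords(txt, searchList)
```

Here out has six elements: the first four carry the underscore-joined keywords, and the last two come back unchanged because they contain none of the keywords.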
Let me know if it works for you.