My data looks like this:
x = "Unable to load the file //xxxx/yyy/abc.pdf onto the RAM"
I need to remove the text between "file" and "onto", so that the output looks like this:
"Unable to load the file onto the RAM"
I tried the rm_between function from the qdapRegex package, but when I call it like this it also removes the words "file" and "onto":
rm_between(x,"file","onto",replacement = "")
I could not find another option that keeps the boundary words.
Answer 0 (score: 4)
A regular expression (regex) and the base R function gsub() can do the job:
gsub("(?<=file).*(?=onto)", " ", x, perl = TRUE)
[1] "Unable to load the file onto the RAM"
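The regex trick used here is a positive lookbehind, (?<=file), and a positive lookahead, (?=onto). Both assert that the marker is present without making it part of the match, so only the text in between is replaced.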
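To see what the lookarounds actually capture, here is a small sketch that extracts the matched span instead of replacing it. The variable name m is only for illustration.

```r
x <- "Unable to load the file //xxxx/yyy/abc.pdf onto the RAM"

# (?<=file) requires "file" immediately before the match,
# (?=onto) requires "onto" immediately after it; neither is consumed.
m <- regmatches(x, regexpr("(?<=file).*(?=onto)", x, perl = TRUE))
m
# [1] " //xxxx/yyy/abc.pdf "
```

The match includes the space on each side of the path, which is why the replacement string is a single space rather than an empty string.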
An alternative that uses capture groups and puts the markers back in the replacement:
gsub("(file).*(onto)", "\\1 \\2", x, perl = TRUE)
[1] "Unable to load the file onto the RAM"
If you want to keep using the function you started with, a simple trick is to put the markers back in the replacement:
qdapRegex::rm_between(x, "file", "onto", replacement = "file onto")
[1] "Unable to load the file onto the RAM"
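Looking at the documentation, there is also an argument that keeps the boundaries (markers) from being removed, which leads to the simplest solution: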
qdapRegex::rm_between(x, "file", "onto", replacement = " ", include.markers = FALSE)
[1] "Unable to load the file onto the RAM"
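If the same clean-up is needed in several places without qdapRegex, the base R approach can be wrapped in a small helper. This is only a sketch: strip_between is a hypothetical name, not a function from base R or qdapRegex, and it assumes the markers are plain words with no regex metacharacters. It uses the lazy quantifier .*? so that several left/right pairs in one string are handled separately, which differs from the greedy .* used above.

```r
# Hypothetical helper, not part of base R or qdapRegex.
strip_between <- function(text, left, right, replacement = " ") {
  # Assumes `left` and `right` contain no regex metacharacters.
  # Lookarounds keep the markers; .*? stops at the nearest `right`.
  pattern <- paste0("(?<=", left, ").*?(?=", right, ")")
  gsub(pattern, replacement, text, perl = TRUE)
}

x <- "Unable to load the file //xxxx/yyy/abc.pdf onto the RAM"
strip_between(x, "file", "onto")
# [1] "Unable to load the file onto the RAM"

# With two marker pairs, each pair is cleaned on its own.
y <- "file A onto B file C onto D"
strip_between(y, "file", "onto")
# [1] "file onto B file onto D"
```

Because gsub() is vectorized, the helper also accepts a character vector of several strings.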