How to use gsub() to remove a complex string pattern

Time: 2014-06-20 20:52:27

Tags: r gsub

I have a data table whose rows look like this:

| _ - 9 | PR - Very happy with results. | Improvement - Be more clear regarding how the entire process works. I.e. how long you have to wait for your account to become active. | Churn Reason - none"  

I am trying to remove the | Improvement...| field from each row, if it is present. I wrote it as

feedback <- gsub("| Improvement*|", "",data$Feedback,  ignore.case = FALSE, perl = TRUE)

But it does nothing. Can someone help me with this?

1 answer:

Answer 0: (score: 2)

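You need to escape the pipe characters |, because an unescaped | is interpreted as OR (alternation). You also need a . for the * quantifier to apply to: in "Improvement*" the * only repeats the final "t", whereas ".*" matches any run of characters. Finally, although it does not matter for this example, you probably want the non-greedy version of * (that is, *?), so that the match stops at the first following pipe and does not swallow the contents of multiple fields.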

Feedback = "| _ - 9 | PR - Very happy with results. | Improvement - Be more clear regarding how the entire process works. I.e. how long you have to wait for your account to become active. | Churn Reason - none"  
feedback <- gsub("\\| Improvement.*?\\|", "", Feedback,  ignore.case = FALSE, perl = TRUE)
print(feedback)

Output:

"| _ - 9 | PR - Very happy with results.  Churn Reason - none"
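The greedy versus non-greedy point does not show up in the example above, because only one pipe follows the Improvement field. The sketch below uses a made-up row (the variable `row` and its contents are hypothetical, not from the question) with two fields after Improvement to make the difference visible. The last pattern is an optional variant, assuming you want to keep the pipe that starts the next field; it relies on a lookahead, which requires perl = TRUE.

```r
# Hypothetical row with two fields after "Improvement" (illustration only).
row <- "| A - 1 | Improvement - be clearer | Churn Reason - none | Notes - n/a"

# Greedy: .* runs to the LAST pipe, so the Churn Reason field is removed too.
greedy <- gsub("\\| Improvement.*\\|", "", row, perl = TRUE)

# Non-greedy: .*? stops at the FIRST pipe after "Improvement".
lazy <- gsub("\\| Improvement.*?\\|", "", row, perl = TRUE)

# Variant (assumption: the next field's leading pipe should be kept).
# The lookahead (?=\\|) requires a following pipe but does not consume it.
keep_pipe <- gsub("\\| Improvement.*?(?=\\|)", "", row, perl = TRUE)

print(greedy)     # "| A - 1  Notes - n/a"
print(lazy)       # "| A - 1  Churn Reason - none | Notes - n/a"
print(keep_pipe)  # "| A - 1 | Churn Reason - none | Notes - n/a"

# gsub() is vectorised: rows without an Improvement field come back unchanged.
rows <- c(row, "| B - 2 | PR - fine")
cleaned <- gsub("\\| Improvement.*?\\|", "", rows, perl = TRUE)
```

Note that all three patterns require a pipe after the Improvement field, so a row where Improvement is the last field would be left untouched.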