I have the following string:
review <- c("1a. How long did it take for you to receive a personalized response to an internet or email inquiry made to THIS dealership?: Approx. It was very prompt however. 2f. Consideration of your time and responsiveness to your requests.: Were a little bit pushy but excellent otherwise 2g. Your satisfaction with the process of coming to an agreement on pricing.: Were willing to try to bring the price to a level that was acceptable to me. Please provide any additional comments regarding your recent sales experience.: Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)! ")
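I want to remove everything that precedes each colon (:), that is, every question that ends with a colon.
I tried the following code: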
gsub("^[^:]+:","",review)
However, it only removes the first sentence that ends with a colon.
Expected result:
Approx. It was very prompt however. Were a little bit pushy but excellent otherwise Were willing to try to bring the price to a level that was acceptable to me. Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)!
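Any help or suggestions would be greatly appreciated. Thanks.
Answer 0 (score: 2)
If the sentences are not complex and contain no abbreviations, you can use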
gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)
See the regex demo.
Note that you can generalize this a bit by changing \\d+[a-zA-Z] to [0-9a-zA-Z]+ or [[:alnum:]]+ so that the label part matches 1 or more digits or letters.
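As a rough sketch of the generalization mentioned above, the snippet below swaps the label part for [[:alnum:]]+ and runs it on two made-up strings (they are not from the review). Passing perl = TRUE is a choice made for this sketch, not something the answer specifies. The second string illustrates a possible side effect: when an answer is followed directly by an unnumbered question, the broader label can also swallow the last word of that answer.

```r
# Generalized label: 1 or more letters/digits followed by a dot.
# perl = TRUE is an assumption of this sketch, not part of the answer.
generalized <- "(?:[[:alnum:]]+\\.)?[^.?!:]*[?!.]:\\s*"

# Hypothetical sample with labels that the narrower \\d+[a-zA-Z] would miss.
labelled <- "12ab. How fast was the reply?: Very fast. 3. Was the price fair?: Yes."
res_labelled <- gsub(generalized, "", labelled, perl = TRUE)
print(res_labelled)
# [1] "Very fast. Yes."

# Hypothetical sample with an unnumbered question: "fine." is taken as a label.
unlabelled <- "It was fine. Any other comments?: None."
res_unlabelled <- gsub(generalized, "", unlabelled, perl = TRUE)
print(res_unlabelled)
# [1] "It was None."
```

If unnumbered questions can occur in the data, the narrower \\d+[a-zA-Z] label is probably the safer choice.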
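Details
(?:\d+[a-zA-Z]\.)? - an optional sequence of:
    \d+ - 1 or more digits
    [a-zA-Z] - an ASCII letter
    \. - a dot
[^.?!:]* - 0 or more characters other than ., ?, ! and :
[?!.] - a ?, ! or .
: - a colon
\s* - 0 or more whitespace characters
R test: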
> gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)
[1] "Approx. It was very prompt however. Were a little bit pushy but excellent otherwise Were willing to try to bring the price to a level that was acceptable to me.Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)! "
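For contrast with the attempt in the question: the ^ in "^[^:]+:" ties the match to the start of the string, so gsub can only ever replace once, whereas the pattern above has no anchor and can match at every label. A minimal sketch on a short made-up string (the sample text and the perl = TRUE argument are assumptions of this example):

```r
# Hypothetical short input with two labelled questions.
s <- "1a. Fast?: Yes. 2b. Fair.: Sure"

# Anchored attempt from the question: only the text up to the first colon goes.
anchored <- gsub("^[^:]+:", "", s)
print(anchored)
# [1] " Yes. 2b. Fair.: Sure"

# Unanchored pattern from the answer: every "label question:" chunk is removed.
unanchored <- gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", s, perl = TRUE)
print(unanchored)
# [1] "Yes. Sure"
```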
Extending to handle abbreviations
If you add an alternation, you can enumerate the exceptions:
gsub("(?:\\d+[a-zA-Z]\\.)?(?:i\\.?e\\.|[^.?!:])*[?!.]:\\s*", "", review)
                          ^^^^^^^^^^^^^^^^^^^^^^
Here, (?:i\.?e\.|[^.?!:])* matches 0 or more occurrences of either an ie. or i.e. substring, or any character other than ., ?, ! or :.
See this demo.
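To illustrate the effect of the alternation, here is a small sketch comparing the two patterns on a made-up question that contains "i.e." (the sample string and perl = TRUE are assumptions of this example, not part of the answer):

```r
# Hypothetical question containing an abbreviation before the colon.
s <- "1a. Was the offer, i.e. the price, fair?: Yes."

base_pattern     <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"
extended_pattern <- "(?:\\d+[a-zA-Z]\\.)?(?:i\\.?e\\.|[^.?!:])*[?!.]:\\s*"

# Without the exception, the dots in "i.e." stop the match early,
# so only the tail of the question is removed.
res_base <- gsub(base_pattern, "", s, perl = TRUE)
print(res_base)
# [1] "1a. Was the offer, i.e.Yes."

# With the exception, "i.e." is consumed as a unit and the whole question goes.
res_extended <- gsub(extended_pattern, "", s, perl = TRUE)
print(res_extended)
# [1] "Yes."
```

Other abbreviations would need to be added to the alternation in the same way, one branch per exception.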