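Regex: how to remove the space after a period and before a punctuation mark

Time: 2016-12-21 14:00:10

Tags: r regex

I have a question about regular expressions. Suppose I have this string: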

"She gained about 55 pounds in...9 months. She was like an eating machine. ”Trump, a man who wants to be president: "

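I want to remove every space that comes after a period and before the ” character, and remove the ” character as well.

For example, this part of the sentence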

She was like an eating machine. ”Trump, a man who wants to be president: 

should become

She was like an eating machine.Trump, a man who wants to be president: 

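Thanks, everyone. Regular expressions are not easy to learn, and I appreciate any help! I am using R, but I think that is irrelevant, since regular expressions work across programming languages.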

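Update

I solved my problem and want to share the solution in case it helps others. I downloaded a dataset of Trump and Hillary tweets from Kaggle.

Before importing the data into Knime (a university project), I had to do some cleaning. I fixed all the encoding problems with gsub except this one. I finally solved it by writing the csv file from R with UTF-8 encoding. Naturally, I used the same encoding in Knime to read the file.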
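The update describes writing the csv file from R as UTF-8 so that characters such as ” survive the export. A minimal sketch of that step, assuming a UTF-8 capable R session; the data frame `tweets`, its `text` column, and the temporary path `out_file` are hypothetical stand-ins, not names from the original post:

```r
# Hypothetical one-row data frame standing in for the Kaggle tweets.
# \u201d is the closing curly double quote from the question.
tweets <- data.frame(
  text = "She was like an eating machine. \u201dTrump",
  stringsAsFactors = FALSE
)

# Write the file as UTF-8; out_file is an assumed path for illustration.
out_file <- tempfile(fileext = ".csv")
write.csv(tweets, out_file, row.names = FALSE, fileEncoding = "UTF-8")

# Read it back with the same encoding, as the consuming tool would have to.
check <- read.csv(out_file, fileEncoding = "UTF-8", stringsAsFactors = FALSE)
print(check$text)
```

The key point is only that the writer and the reader agree on the encoding; `fileEncoding` is the argument that `write.csv` and `read.csv` provide for this.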

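3 answers:

Answer 0: (score: 4)

If you need to match any number of whitespace characters (1 or more) between the dot and the curly double quote, you can use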

x <- "She gained about 55 pounds in...9 months. She was like an eating machine. ”Trump, a man who wants to be president: "
gsub("\\.\\s+”", ".", x)
## => [1] "She gained about 55 pounds in...9 months. She was like an eating machine.Trump, a man who wants to be president: "

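\\. matches a dot, \\s+ matches one or more whitespace characters, and ” matches the closing curly double quote.

See the regex demo and the R demo.

If there is only 1 regular space between the dot and the quote, you can use a fixed string replacement: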

gsub(". ”", ".", x, fixed=TRUE)

See this R demo.
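To make the difference between the two approaches in this answer concrete, here is a small sketch that runs both on inputs with different spacing. The vector `inputs` and its element names are made up for illustration, and the curly quote is written as the escape \u201d so the snippet does not depend on the file encoding:

```r
# \u201d is the closing curly double quote from the question.
inputs <- c(
  one_space  = "machine. \u201dTrump",
  two_spaces = "machine.  \u201dTrump",
  tab        = "machine.\t\u201dTrump"
)

# Regex version: a dot, one or more whitespace characters, then the quote.
regex_result <- gsub("\\.\\s+\u201d", ".", inputs)

# Fixed version: the literal three-character string dot, space, quote.
fixed_result <- gsub(". \u201d", ".", inputs, fixed = TRUE)

print(regex_result)
print(fixed_result)
```

The regex version rewrites all three inputs to "machine.Trump", while the fixed version only changes the input with exactly one regular space and leaves the other two untouched.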

Answer 1: (score: 1)

This may help:

var str = 'She was like an eating machine. "Trump, a man who wants to be president. "New value'; 
// replace() returns a new string rather than modifying str, so store the result
var result = str.replace(/\.\s"/g, ".");

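Answer 2: (score: 0)

http://regexr.com/ is a great tool for learning and testing regular expressions.

The only thing I can add to Wiktor's answer is that it does not match "machine.”Trump". To match any number of spaces (including none) after the dot and before the quote, use the * quantifier: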

x <- "She gained about 55 pounds in...9 months. She was like an eating machine. ”Trump, a man who wants to be president: "
gsub("\\.\\s*”", ".", x)
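As a quick check of the point about quantifiers, the sketch below runs both patterns on an input with a space and an input without one. The vector `inputs` is made up for illustration, and \u201d is the escape for the ” character:

```r
# Two test inputs: one with a space before the quote, one without.
inputs <- c("machine. \u201dTrump", "machine.\u201dTrump")

# \\s+ requires at least one whitespace character before the quote.
plus_result <- gsub("\\.\\s+\u201d", ".", inputs)

# \\s* also accepts zero whitespace characters.
star_result <- gsub("\\.\\s*\u201d", ".", inputs)

print(plus_result)
print(star_result)
```

With `+`, the second input keeps its ” because there is no whitespace to match; with `*`, both inputs become "machine.Trump".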