Question

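I have a string that contains words or phrases enclosed in double quotes, and I need to remove them from inside the quotes in Python. Example:

The text has "single quotes" and "commas".

The text has "double quotes".

Removing the words from inside the quotes should give:

The text has "" and "".

The text has "".

I used re.finditer to list all the quoted parts that were found, but I don't know how to remove the words between two quotes in the string. Does anyone know how?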

Answer 1

>>> from re import sub
>>> s
'The text has "single quotes" and "commas".'
>>> sub('".*?"', '" "', s)
'The text has " " and " ".'
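The `?` after `.*` in the pattern above makes the match non-greedy, which is what keeps the two quoted phrases separate. The following is a minimal sketch of the difference, not part of the original answer; it substitutes `""` (no space) to match the output asked for in the question, and the variable names `lazy` and `greedy` are only illustrative.

```python
import re

s = 'The text has "single quotes" and "commas".'

# Non-greedy: each quoted phrase is matched on its own.
lazy = re.sub(r'".*?"', '""', s)

# Greedy: a single match runs from the first quote to the last quote.
greedy = re.sub(r'".*"', '""', s)

print(lazy)    # The text has "" and "".
print(greedy)  # The text has "".
```

With the greedy form, the unquoted word "and" between the two phrases is removed as well, which is usually not what is wanted.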

Answer 2

It is a bit complicated, but maybe

(?<=")[^\s".][^"\r\n]*|[^"\r\n]*[^\s".](?=")

might be worth exploring.

RegEx Demo

This pattern may fail in some edge cases, and you may want to look more closely at this part:

[^\s".]

Test

import re

string = '''
The text has "single quotes" and "commas".
The text has "double quotes"
"single quotes" and "commas"
"double quotes"
"d"
"d""d""d""d"

'''

expression = r'(?<=")[^\s".][^"\r\n]*|[^"\r\n]*[^\s".](?=")'

print(re.sub(expression, '', string))

Output

The text has "" and "".
The text has ""
"" and ""
""
""
""""""""

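If you wish to simplify, modify, or explore the expression, it is explained in the top-right panel of regex101.com. If you like, you can also see in this link how it matches against some sample inputs.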
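To probe the edge cases mentioned above, a small script along the following lines may help. It is only a sketch: the two sample inputs and the comparison pattern `"[^"]*"` are assumptions chosen for illustration, not part of the original answer.

```python
import re

# Pattern from the answer above, plus a simpler quote-to-quote pattern for comparison.
lookaround = r'(?<=")[^\s".][^"\r\n]*|[^"\r\n]*[^\s".](?=")'
simple = r'"[^"]*"'

# Hypothetical edge cases: quoted content made only of periods,
# and an opening quote with no space before it.
cases = ['a "..." b', 'say"hi"']

results = []
for case in cases:
    by_lookaround = re.sub(lookaround, '', case)
    by_simple = re.sub(simple, '""', case)
    results.append((by_lookaround, by_simple))
    print(repr(case), '->', repr(by_lookaround), '|', repr(by_simple))
```

In the first case the lookaround pattern leaves `...` in place, because `[^\s".]` rejects periods. In the second it also removes the unquoted word `say`, because the second alternative matches any text that ends directly before a quote.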

RegEx Circuit

jex.im visualizes regular expressions.

Answer 3

Take a look at this simple regular expression:

"[\w\s]+"

Regex Demo

We match all word characters and any whitespace between the two quotes, then replace the match with "":

import re
string = 'The text has "single quotes" and "commas".'  # sample input
expression = r'"[\w\s]+"'
print(re.sub(expression, '""', string))
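Because the character class `[\w\s]` only covers word characters and whitespace, quoted text containing other characters is not matched. The sketch below illustrates this with made-up sample sentences; the helper name `blank_quotes` is hypothetical.

```python
import re

expression = r'"[\w\s]+"'

def blank_quotes(text):
    # Replace each quoted run of word characters and whitespace with empty quotes.
    return re.sub(expression, '""', text)

plain = blank_quotes('The text has "single quotes" and "commas".')
hyphenated = blank_quotes('She said "well-known" twice.')

print(plain)       # The text has "" and "".
print(hyphenated)  # She said "well-known" twice.
```

The hyphen in the second sample is neither a word character nor whitespace, so that phrase is left untouched. If punctuation can appear inside the quotes, a negated class such as `[^"]` may be a better fit.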

Answer 4

You can use this code. I hope it helps.

import re
text = 'The text has "single quotes" and "commas".'
text = re.sub('"[^"]*"', '""', text)  # a quote, any non-quote characters, a closing quote
print(text)  # The text has "" and "".
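Inside a character class, `$` is a literal dollar sign rather than an end-of-string anchor. If the intent is to also handle a quote that is never closed, one possible sketch uses the alternation `(?:"|$)` instead. The function name `strip_quoted` and the second sample string are assumptions for illustration.

```python
import re

def strip_quoted(text):
    # (?:"|$) accepts either a closing quote or the end of the string,
    # so an unterminated quote is emptied up to the end of the text.
    return re.sub(r'"[^"]*(?:"|$)', '""', text)

closed = strip_quoted('The text has "single quotes" and "commas".')
unclosed = strip_quoted('An "unterminated quote')

print(closed)    # The text has "" and "".
print(unclosed)  # An ""
```

Whether an unterminated quote should be emptied or left alone depends on the data, so this behavior is a design choice rather than a requirement of the question.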

How do I remove quoted words/phrases from a string using Python?
