Question

I'm reading a response from a source such as a journal or an essay, and I have the HTML response as a string, like:

According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.

My goal is to extract all quoted passages from the given string and save each one into a list. My approach was:

[match.start() for m in re.Matches(inputString, "\"([^\"]*)\""))]

Somehow it doesn't work for me. Can anyone help with my regex? Thanks a lot.

Answer 1

Provided there are no nested quotes:

re.findall(r'"([^"]*)"', inputString)

Demo:

>>> import re
>>> inputString = 'According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.'
>>> re.findall(r'"([^"]*)"', inputString)
['profound aspects of personality']
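As for why the attempt in the question fails: Python's re module has no Matches function, and the comprehension mixes the names match and m. If the start offsets are also wanted, a minimal sketch using re.finditer could look like this (the variable names here are illustrative, not from the question):

```python
import re

input_string = 'According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.'

# re.finditer takes the pattern first and the string second,
# and yields one match object per non-overlapping match.
quote_pattern = re.compile(r'"([^"]*)"')
matches = list(quote_pattern.finditer(input_string))

# group(1) is the text between the quotes; start() is the offset of the opening quote.
quotes = [m.group(1) for m in matches]
starts = [m.start() for m in matches]

print(quotes)
print(starts)
```

Each entry in starts is the index of an opening double quote, so quotes and starts can be zipped together when both the text and its position are needed.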

Answer 2

Use this if your input can contain something like: some "text \" and text" more

import re

s = '''According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.'''
lst = re.findall(r'"(.*?)(?<!\\)"', s)
print(lst)

This uses the negative lookbehind (?<!\\) to check that there is no \ immediately before the closing ".
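To see the difference on input that actually contains an escaped quote, here is a small sketch comparing the simple character-class pattern with the lookbehind pattern. The sample string is the one mentioned above; the variable names are illustrative.

```python
import re

# Raw string, so the backslash before the inner quote is kept literally.
escaped_input = r'some "text \" and text" more'

# The simple pattern stops at the first double quote, escaped or not.
plain = re.findall(r'"([^"]*)"', escaped_input)

# The lookbehind pattern skips any double quote preceded by a backslash.
lookbehind = re.findall(r'"(.*?)(?<!\\)"', escaped_input)

print(plain)
print(lookbehind)
```

Note that the lookbehind only inspects a single preceding character, so a quoted string that ends in an escaped backslash (for example "a\\") would not be closed where one might expect; handling that case needs a more elaborate pattern.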
