Regex: strings and escaped quotes

Time: 2013-04-21 10:56:19

Tags: python regex

How can I get the content between the quotes in the following two texts?

text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
     + "Another text on \"three\" lines" """

text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """

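My problem is that a quote must be ignored when it is escaped, yet the backslash itself can also be escaped.

I would like to obtain the following groups.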

[
    r'Some text on \"two\" lines with a backslash escaped\\',
    r'Another text on \"three\" lines'
]

4 answers:

Answer 0 (score: 16)

"(?:\\.|[^"\\])*"

This matches a quoted string, including any escaped characters that occur inside it.

Explanation:

"       # Match a quote.
(?:     # Either match...
 \\.    # an escaped character
|       # or
 [^"\\] # any character except quote or backslash.
)*      # Repeat any number of times.
"       # Match another quote.

Answer 1 (score: 1)

Match everything except double quotes:

import re
text = "Some text on \"two\" lines" + "Another text on \"three\" lines"
print(re.findall(r'"([^"]*)"', text))

Output:

['two', 'three']
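
For comparison, here is a small sketch that runs the same pattern against the raw `text_2` from the question, where the backslashes are literal characters. The variable name `naive` is only illustrative.

```python
import re

text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """

# [^"]* stops at every double quote, whether or not a backslash precedes it,
# so each escaped quote also ends a match.
naive = re.findall(r'"([^"]*)"', text_2)

for piece in naive:
    print(repr(piece))
```

On this input the pattern returns four fragments instead of the two strings asked for, because it has no notion of an escaped quote.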

Answer 2 (score: 0)

>>> import re
>>> text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
     + "Another text on \"three\" lines" """
>>> text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """
>>> re.findall(r'\\"([^"]+)\\"', text_2)
['two', 'three']
>>> re.findall(r'\\"([^"]+)\\"', text_1)
['two', 'three']

Maybe you want this:

re.findall(r'\\"((?:(?<!\\)[^"])+)\\"', text)

Answer 3 (score: 0)

>>> import re
>>> text = "Some text on\n\"two\"lines" + "Another texton\n\"three\"\nlines"
>>> re.findall(r'"(.*)"', text)
["two", "three"]