How can I get the content between the quotes in the following two texts?
text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
+ "Another text on \"three\" lines" """
text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """
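The difficulty for me is that a quote must be ignored when it is escaped, but the backslash itself can also be escaped.
I would like to obtain the following groups.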
[
r'Some text on \"two\" lines with a backslash escaped\\',
r'Another text on \"three\" lines'
]
Answer 0 (score: 16)
"(?:\\.|[^"\\])*"
This matches a quoted string, including any escaped characters inside it.
Explanation:
" # Match a quote.
(?: # Either match...
\\. # an escaped character
| # or
[^"\\] # any character except quote or backslash.
)* # Repeat any number of times.
" # Match another quote.
Answer 1 (score: 1)
Match everything except double quotes:
import re
text = "Some text on \"two\" lines" + "Another text on \"three\" lines"
print(re.findall(r'"([^"]*)"', text))
Output:
['two', 'three']
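One caveat: the string literal in this answer is not raw, so Python has already removed the backslashes before the regex runs. As a rough check, assuming the raw `text_2` from the question (the names `desired` and `naive` are introduced here only for illustration), the same pattern splits at every quote, escaped or not:

```python
import re

# Raw text from the question: the backslashes are part of the data.
text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """

# The groups the question asks for.
desired = [
    r'Some text on \"two\" lines with a backslash escaped\\',
    r'Another text on \"three\" lines',
]

# The pattern from this answer treats every quote as a delimiter,
# including the escaped ones.
naive = re.findall(r'"([^"]*)"', text_2)

print(naive)
print(naive == desired)  # False: the escaped quotes were not skipped
```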
Answer 2 (score: 0)
>>> import re
>>> text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
+ "Another text on \"three\" lines" """
>>> text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """
>>> re.findall(r'\\"([^"]+)\\"', text_2)
['two', 'three']
>>> re.findall(r'\\"([^"]+)\\"', text_1)
['two', 'three']
Maybe you want this:
re.findall(r'\\"((?:(?<!\\)[^"])+)\\"', text)
Answer 3 (score: 0)
>>> import re
>>> text = "Some text on\n\"two\"lines" + "Another texton\n\"three\"\nlines"
>>> re.findall(r'"(.*)"', text)
["two", "three"]