Basically, I have a txt document containing this:
The sound of a horse at a gallop came fast and furiously up the hill.
"So-ho!" the guard sang out, as loud as he could roar.
"Yo there! Stand! I shall fire!"
The pace was suddenly checked, and, with much splashing and floundering, a man's voice called from the mist, "Is that the Dover mail?"
"Never you mind what it is!" the guard retorted. "What are you?"
"_Is_ that the Dover mail?"
"Why do you want to know?"
"I want a passenger, if it is."
"What passenger?"
"Mr. Jarvis Lorry."
Our booked passenger showed in a moment that it was his name.
The guard, the coachman, and the two other passengers eyed him distrustfully.
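Using a regular expression, I need to print everything inside the double quotes. I don't want the complete code; I just need to know how I should go about it, and which regular expression would be most useful. Tips and pointers, please!
Answer 0 (score: 3)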
r'(".*?")'
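will match every double-quoted string. The parentheses denote a capturing group; because the quotes sit inside the parentheses here, they are part of what is captured. . matches any character (except a newline), * means repetition, and ? makes the repetition non-greedy (it stops matching just before the next double quote). If needed, add the re.DOTALL option so that . also matches newlines.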
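As a minimal sketch of the pattern above (the sample sentence is made up for illustration and is not from the question's text), re.findall returns the captured group for each match, quotes included:

```python
import re

# Hypothetical one-line sample, not taken from the question's document.
sample = 'He said "Stop!" and then "Who goes there?"'

# The quotes are inside the parentheses, so they are part of each result.
matches = re.findall(r'(".*?")', sample)
print(matches)
```

Moving the parentheses inside the quotes would return the text without the surrounding quote characters.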
Answer 1 (score: 0)
This should do it (explanation below):
from __future__ import print_function
import re
txt = """The sound of a horse at a gallop came fast and furiously up the hill.
"So-ho!" the guard sang out, as loud as he could roar.
"Yo there! Stand! I shall fire!"
The pace was suddenly checked, and, with much splashing and floundering,
a man's voice called from the mist, "Is that the Dover mail?"
"Never you mind what it is!" the guard retorted. "What are you?"
"_Is_ that the Dover mail?"
"Why do you want to know?"
"I want a passenger, if it is."
"What passenger?"
"Mr. Jarvis Lorry."
Our booked passenger showed in a moment that it was his name.
The guard, the coachman, and the two other passengers eyed him distrustfully.
"""
# findall returns only the captured group, i.e. the text without the quotes
strings = re.findall(r'"(.*?)"', txt)
for s in strings:
    print(s)
Result:
So-ho!
Yo there! Stand! I shall fire!
Is that the Dover mail?
Never you mind what it is!
What are you?
_Is_ that the Dover mail?
Why do you want to know?
I want a passenger, if it is.
What passenger?
Mr. Jarvis Lorry.
r'"(.*?)"'
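will match every string inside double quotes. The parentheses denote a capturing group, so you get only the text without the double quotes. . matches any character (except a newline), and * means "zero or more of the preceding item", the preceding item here being the . character class. The ? after the * makes the * "non-greedy", which means it matches as little as possible. Without the ?, each line would yield at most one result: a string containing everything between the first and last double quote on that line.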
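To illustrate the greedy versus non-greedy difference described above, here is a small sketch run on a single line from the sample text (the variable names are chosen for illustration only):

```python
import re

# One line from the sample text that contains two quoted strings.
line = '"Never you mind what it is!" the guard retorted. "What are you?"'

# Non-greedy: each match ends at the nearest closing quote.
lazy = re.findall(r'"(.*?)"', line)

# Greedy: the match runs to the last quote on the line.
greedy = re.findall(r'"(.*)"', line)

print(lazy)
print(greedy)
```

The non-greedy pattern yields two strings, while the greedy one yields a single string that swallows the narration between the two quotations.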
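If you want to extract strings that span multiple lines, you can include the re.DOTALL flag so that . also matches newlines. To do that, use re.findall(r'"(.*?)"', txt, re.DOTALL). The newlines will be included in the resulting strings, so you will have to check for them.
The explanation is unavoidably similar to, and based on, @TigerhawkT3's answer. Upvote that answer too!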
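A short sketch of the re.DOTALL behaviour, assuming a made-up quotation that wraps onto a second line (the text and the whitespace clean-up step are illustrative, not part of the original answer):

```python
import re

# Hypothetical text in which one quotation spans two lines.
text = 'He began, "This quote\ncontinues on a second line," and stopped.'

# Without the flag, . cannot cross the newline, so nothing matches.
without_flag = re.findall(r'"(.*?)"', text)

# With re.DOTALL, . also matches the newline and the quote is found.
with_flag = re.findall(r'"(.*?)"', text, re.DOTALL)

# One possible way to deal with the embedded newline: collapse whitespace.
cleaned = [' '.join(s.split()) for s in with_flag]

print(without_flag)
print(with_flag)
print(cleaned)
```

Collapsing whitespace is only one option; whether to keep, replace, or strip the newlines depends on how the extracted strings will be used.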