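I am having trouble with the output I get for this problem. Basically, I have a text file (https://www.py4e.com/code3/mbox.txt), and I am trying to get Python to first print how many email addresses it finds in the file, and then print each address on the lines that follow. A sample of my output looks like this: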
Received: (from apache@localhost)
There were 22003 email addresses in mbox.txt
for source@collab.sakaiproject.org; Thu, 18 Oct 2007 11:31:49 -0400
There were 22004 email addresses in mbox.txt
X-Authentication-Warning: nakamura.uits.iupui.edu: apache set sender to zach.thomas@txstate.edu using -f
There were 22005 email addresses in mbox.txt
What am I doing wrong here? This is my code:
    fhand = open('mbox.txt')
    count = 0
    for line in fhand:
        line = line.rstrip()
        if '@' in line:
            count = count + 1
            print('There were', count, 'email addresses in mbox.txt')
        if '@' in line:
            print(line)
Answer 0: (score: 0)
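Could you show more clearly how the expected output differs from the actual output?
You have two if '@' in line statements that should be merged; there is no reason to ask the same question twice.
You count the number of lines that contain the @ symbol, and then print the running count for each of those lines.
If you only want to print the count once, put that print outside (after) the for loop.
If you want to print the email addresses rather than the whole lines that contain them, you will need some more string processing to extract the address from each line.
Don't forget to close the file when you are done.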
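The points above can be sketched roughly as follows. This is only one possible arrangement, not necessarily what the asker intended: the helper name collect_at_lines and the in-memory sample list are made up for illustration so the snippet runs without mbox.txt.

```python
def collect_at_lines(lines):
    """Return every line containing '@', with trailing whitespace stripped."""
    matches = []
    for line in lines:
        line = line.rstrip()
        if '@' in line:  # a single check instead of two identical ones
            matches.append(line)
    return matches


# Hypothetical stand-in for the file contents (two lines from the question
# plus one made-up line without an address).
sample = [
    "Received: (from apache@localhost)\n",
    "Subject: no address here\n",
    "for source@collab.sakaiproject.org; Thu, 18 Oct 2007 11:31:49 -0400\n",
]

matches = collect_at_lines(sample)

# The count is printed once, after the loop has finished.
print('There were', len(matches), 'lines containing @ in the sample')
for line in matches:
    print(line)
```

With the real file, something like `with open('mbox.txt') as fhand: matches = collect_at_lines(fhand)` should work, and the with block also closes the file for you. Note that this still counts lines containing @, not individual addresses.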
Answer 1: (score: 0)
The following modifies your code to use a regular expression to find the emails in each line of text.
    import re
    # Pattern for email
    # (see https://www.geeksforgeeks.org/extracting-email-addresses-using-regular-expressions-python/)
    pattern = re.compile(r'\S+@\S+')
    with open('mbox.txt') as fhand:
        emails = []
        for line in fhand:
            # Detect all emails in line using regex pattern
            found_emails = pattern.findall(line)
            if found_emails:
                emails.extend(found_emails)
    print('There were', len(emails), 'email addresses in mbox.txt')
    if emails:
        print(*emails, sep="\n")
Output
There were 44018 email addresses in mbox.txt
stephen.marquard@uct.ac.za
<postmaster@collab.sakaiproject.org>
<200801051412.m05ECIaH010327@nakamura.uits.iupui.edu>
<source@collab.sakaiproject.org>;
<source@collab.sakaiproject.org>;
<source@collab.sakaiproject.org>;
apache@localhost)
source@collab.sakaiproject.org;
stephen.marquard@uct.ac.za
source@collab.sakaiproject.org
....
....
...etc...
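Because \S+@\S+ grabs any run of non-whitespace around the @, several tokens in the output above still carry punctuation such as <, >, ; or ). If cleaner addresses are wanted, a tighter pattern might help. The sketch below is an assumption on my part, not part of the answer above: the names EMAIL and extract_emails are hypothetical, and the pattern is deliberately loose (it does not require a dot in the domain, so apache@localhost still matches) rather than a full address validator.

```python
import re

# Assumed pattern: a local part, '@', then dot-separated domain labels.
# Surrounding characters such as < > ; ( ) are not in either character
# class, so they are left out of the match.
EMAIL = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*')


def extract_emails(lines):
    """Return all address-like tokens found in an iterable of lines."""
    emails = []
    for line in lines:
        emails.extend(EMAIL.findall(line))
    return emails


# Sample lines and tokens taken from the question and the output above.
sample = [
    "Received: (from apache@localhost)",
    "for source@collab.sakaiproject.org; Thu, 18 Oct 2007 11:31:49 -0400",
    "<postmaster@collab.sakaiproject.org>",
    "X-Authentication-Warning: nakamura.uits.iupui.edu: apache set sender to zach.thomas@txstate.edu using -f",
]

for address in extract_emails(sample):
    print(address)
```

This would still pick up address-shaped strings that are not mailboxes, for example a token like 200801051412.m05ECIaH010327@nakamura.uits.iupui.edu, so some filtering may still be needed depending on what should count as an email address.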