Question

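Hi, I have a Python script that visits a website, searches for strings inside certain tags, and prints them. The printed output looks like this: textidontwant textiwanthere.com. How can I search for .com and print a certain number of characters before it, so that only textiwanthere.com is shown instead of everything? Here is my code: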

import urllib.request
import re
import os

url = "http://www.throwawaymail.com/"

request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
sourcecode = urllib.request.urlopen(request).read()
output = sourcecode.decode("utf-8")

findemail = re.findall('>(.*?)</span>', str(output))

print(findemail)

os.system("pause")

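I want to search "findemail" for the address and print only phamepracl@throwam.com. The address is different every time, but its length is always the same. This is what my console shows: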

['Toggle navigation', '', '', '', '', 'phamepracl@throwam.com']

Answer 1

Just print the last entry of the list:

print(findemail[-1])

If you don't need the other entries, you can also assign this value to findemail directly:

findemail = re.findall('>(.*?)</span>', str(output))[-1]

This works for me:

import urllib.request
import re
import os

url = "http://www.throwawaymail.com/"

request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
sourcecode = urllib.request.urlopen(request).read()
output = sourcecode.decode("utf-8")

findemail = re.findall('>(.*?)</span>', str(output))

print(findemail[-1])
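
Note that findemail[-1] raises an IndexError if the regex matches nothing, and it returns an empty string if the last match happens to be blank. As a minimal sketch (the helper name last_non_empty is my own, and the sample list is modeled on the console output in the question), you could pick the last non-blank entry instead:

```python
def last_non_empty(items, default=None):
    """Return the last entry that is not blank, or `default` if there is none."""
    for item in reversed(items):
        stripped = item.strip()
        if stripped:
            return stripped
    return default


# Sample data modeled on the console output shown in the question.
findemail = ['Toggle navigation', '', '', '', '', 'phamepracl@throwam.com']

print(last_non_empty(findemail))
print(last_non_empty([]))
```

This still assumes the address is the last non-blank match on the page, which may stop being true if the site layout changes.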

Answer 2

Here is my solution:

for entry in findemail:
    # Keep only the entries that contain ".com"
    if '.com' in entry:
        print(entry)

Output:

hudininona@throwam.com
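
For the literal question (printing a fixed number of characters before ".com"), here is a rough sketch using str.find and slicing. The function name text_before_marker and the count of 13 are assumptions for illustration; the count only works because the question says the wanted text always has the same length.

```python
def text_before_marker(text, marker=".com", count=13):
    """Return `count` characters before the first `marker`, plus the marker itself.

    Returns None when the marker is not present in `text`.
    """
    idx = text.find(marker)
    if idx == -1:
        return None
    # Clamp at 0 so a short prefix does not wrap around to the end of the string.
    start = max(0, idx - count)
    return text[start:idx + len(marker)]


sample = "textidontwant textiwanthere.com"
print(text_before_marker(sample, ".com", 13))  # textiwanthere.com
```

If the length can vary, splitting on whitespace and keeping the pieces that end with ".com" is probably more robust than counting characters.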

How to print a certain number of characters before a string in Python
