How to process every item in a list

Time: 2017-07-09 16:15:29

Tags: python python-2.7 python-3.x

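Hi everyone, I'm trying to extract the text from several websites. Everything works, but when I run the script I only get the text of one website, even though the domain tuple I created contains 3 websites. What am I doing wrong? I need the text of every item in domain to end up in the file.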

Thanks

from bs4 import BeautifulSoup
import requests
import urllib3
import certifi

http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs=certifi.where())

domain =('https://www.betfair.com/exchange/', 'https://docs.python.org/3/library/urllib.parse.html','https://anaconda.org/pypi/urllib3')
for url in domain:
    page = requests.get(url, verify=True)
    soup = BeautifulSoup(page.content, 'html.parser')
    content = (soup.get_text().encode('utf-8'))
    with open("article.txt", "w") as wa, open("article.txt", "r") as ra, open('outfile.txt', "w") as outfile:
        wa.write(content)
        for line in ra:
            if not line.strip(): continue
            outfile.write(line)

1 answer:

Answer 0 (score: -1)

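I believe you are overwriting the file on every pass through the loop: mode "w" truncates the file each time it is opened, so only the last site survives. That is why you should open the file in append mode, like this: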

with open('filename.txt', 'a'):
    ...
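
An alternative to append mode, as a rough sketch rather than a drop-in fix: open the output file once, before the loop, so that "w" truncates it only a single time. The helper names strip_blank_lines and save_all and the pages list are made up for illustration; pages stands in for the soup.get_text() result of each URL so the snippet runs without network access. It also writes str rather than the result of .encode('utf-8'), since writing bytes to a text-mode file raises TypeError on Python 3, and it skips the intermediate article.txt, whose buffered write may not be visible yet to the second handle that reads it.

```python
import os
import tempfile


def strip_blank_lines(text):
    """Return the non-blank lines of text, each ending with a newline."""
    return [line + "\n" for line in text.splitlines() if line.strip()]


def save_all(texts, out_path):
    """Write the non-blank lines of every text to out_path.

    The file is opened once, before the loop, so "w" truncates it only
    once and later items cannot overwrite earlier ones.
    """
    with open(out_path, "w", encoding="utf-8") as outfile:
        for text in texts:
            outfile.writelines(strip_blank_lines(text))


# Hypothetical stand-ins for soup.get_text() from each URL in `domain`.
pages = ["first page\n\n  \nline two\n", "second page\n", "\nthird page"]

out_path = os.path.join(tempfile.mkdtemp(), "outfile.txt")
save_all(pages, out_path)

with open(out_path, encoding="utf-8") as f:
    result = f.read()

print(result)
```

In the real script the loop body would call requests.get(url) and BeautifulSoup(page.content, 'html.parser').get_text() in place of the pages list. Append mode works too, but the output then keeps growing across runs unless the file is deleted first.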

Hope that helps