How to request multiple URLs listed in a file with requests

Asked: 2017-10-04 20:09:37

Tags: python-2.7 web-scraping beautifulsoup python-requests

I am trying to scrape multiple websites from the URLs in a txt file. There is one URL per line.

My code is:

import requests
from bs4 import BeautifulSoup

file = open('url.txt', 'r')
filelines = file.readline()
urllist = requests.get(filelines)
soup = BeautifulSoup(urllist.content, "html.parser")
content = soup.find_all("span", {"class": "title-main-info"})
print content

But it only prints the content of the last URL (the last line). What am I doing wrong? Thanks.

1 Answer:

Answer 0 (score: 1)

Try this. It should work:

import requests
from bs4 import BeautifulSoup

with open('url.txt', 'r') as f:
    for links in f.readlines():                 # loop over every line, not just the first one
        urllist = requests.get(links.strip())   # strip the trailing newline before requesting
        soup = BeautifulSoup(urllist.content, "html.parser")
        content = soup.find_all("span", {"class": "title-main-info"})
        print content                            # Python 2 print statement, matching the python-2.7 tag
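
If url.txt contains blank lines or one of the sites is unreachable, a single failed request will stop the whole loop. Below is a minimal, more defensive sketch (assuming the same url.txt file and the same span class as above); the 10-second timeout and the error message wording are illustrative choices, not part of the original answer:

import requests
from bs4 import BeautifulSoup

with open('url.txt', 'r') as f:
    for line in f:                        # iterate the file line by line
        url = line.strip()
        if not url:                       # skip empty lines
            continue
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()   # raise for HTTP 4xx/5xx responses
        except requests.exceptions.RequestException as err:
            print('Failed to fetch %s: %s' % (url, err))
            continue                      # move on to the next URL instead of crashing
        soup = BeautifulSoup(response.content, "html.parser")
        for span in soup.find_all("span", {"class": "title-main-info"}):
            print(span.get_text(strip=True))   # print only the text of each matching span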