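I am new to Python and I am trying to extract information from a web page (http://findanrd.eatright.org/listing/search?zipCode=page=1).
My code can collect all the links to the "information pages", but I cannot extract the information on those pages.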
<div class="user-info-box clearfix">
<dl class="details-left">
<dl class="details-left">
<dl class="details-right">
<dd>26850 Providence Parkway, Suite 425</dd>
<dd>Novi, MI 48374</dd>
<dd>Email: info@aartibatavia.com</dd>
<dd>
Website:
<a href="http://www.aartibatavia.com/" target="_blank">www.aartibatavia.com/</a>
</dd>
</dl>
I want to extract the information above, such as the street address, the email address, and the website. My code looks like this:
import requests
from bs4 import BeautifulSoup

def nutrispider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://findanrd.eatright.org/listing/search?zipCode=&page=' + str(page)
        source_code = requests.get(url)
        text = source_code.text
        soup = BeautifulSoup(text)
        x = 0
        while x<=19:
            rows = soup.findAll('tr', {'data-index':x})
            for row in rows:
                link_elm = row.find('div', {'class':'search-address-list-address'}).a
                link = 'http://findanrd.eatright.org' + link_elm['href']
                users = soup.findAll('div', {'class': 'user-info-box clearfix'})
                for user in users:
                    information = user.find('dd')
                    text = information.get_Text()
                    print(text)
                print(link)
            x += 1
        page += 1

nutrispider(1)
There are currently no errors, but it only prints the links to the subpages where the information is located.
Answer 0 (score: 0)
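import requests, bs4

url = 'http://findanrd.eatright.org/listing/search?zipCode=page=1'
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'lxml')
for tr in soup.table('tr'):
    # str.strip('View details') strips any of those characters from both ends,
    # not the phrase itself, so remove the link text explicitly instead.
    address = tr.find(class_='search-address-list-address').get_text(strip=True).replace('View details', '')
    name = tr.find(class_='search-address-list-name').get_text(strip=True)
    # The first link in the row's <p> element is the map link.
    link = tr.p.a['href']
    print(name, address, link)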
Out:
Aarti Batavia, MS RD IFMCP 26850 Providence Parkway, Suite 425Novi, MI 48374 http://maps.google.com/maps?saddr=&daddr=26850 Providence Parkway, Suite 425 Novi, MI 48374
Aarti Batavia, MS RD IFMCP 26850 Providence Parkway, Suite 425Novi, MI 48374 http://maps.google.com/maps?saddr=&daddr=26850 Providence Parkway, Suite 425 Novi, MI 48374
Abbey Carlson, RD 3935 N 75 WHyde Park, UT 84318 http://maps.google.com/maps?saddr=&daddr=3935 N 75 W Hyde Park, UT 84318
Abbi Kifer, MED RDN LD PO Box 120Mount Storm, WV 26739 http://maps.google.com/maps?saddr=&daddr=PO Box 120 Mount Storm, WV 26739
Abbie Scott, RD LD Hy-Vee, Inc.3221 SE 14th StreetDes Moines, IA 50320 http://maps.google.com/maps?saddr=&daddr=3221 SE 14th Street Des Moines, IA 50320
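The code above reads the listing page only, while the question asks for the fields on each detail page. The following is a minimal sketch of that second step, assuming the detail page markup matches the excerpt in the question. The helper name `parse_details` and the `SAMPLE_HTML` constant are hypothetical; the sample is the question's excerpt with closing tags added so that it parses offline, and the built-in `html.parser` is used so no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Markup copied from the excerpt in the question; closing tags were added
# so the fragment is well-formed. The real page may differ (assumption).
SAMPLE_HTML = """
<div class="user-info-box clearfix">
  <dl class="details-right">
    <dd>26850 Providence Parkway, Suite 425</dd>
    <dd>Novi, MI 48374</dd>
    <dd>Email: info@aartibatavia.com</dd>
    <dd>
      Website:
      <a href="http://www.aartibatavia.com/" target="_blank">www.aartibatavia.com/</a>
    </dd>
  </dl>
</div>
"""


def parse_details(html):
    """Hypothetical helper: pull address lines, email and website out of a detail page."""
    soup = BeautifulSoup(html, 'html.parser')
    details = {'address': [], 'email': None, 'website': None}
    for box in soup.find_all('div', class_='user-info-box'):
        # find_all returns every <dd>; find would return only the first one.
        for dd in box.find_all('dd'):
            text = dd.get_text(' ', strip=True)
            if text.startswith('Email:'):
                details['email'] = text[len('Email:'):].strip()
            elif text.startswith('Website:'):
                anchor = dd.find('a')
                details['website'] = anchor['href'] if anchor else None
            elif text:
                # Any other non-empty <dd> is treated as an address line.
                details['address'].append(text)
    return details


details = parse_details(SAMPLE_HTML)
print(details)
```

To use this against the live site, the text of each detail page would be fetched first, for example `parse_details(requests.get(link).text)`, where `link` is the detail URL built in the question's loop. The question's code instead searches the listing page's `soup` for `user-info-box`, which is consistent with nothing being printed except the links. It also calls `get_Text()`, which is not the Beautiful Soup method name; the method is `get_text()`.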