Duplicate elements from a for loop in BeautifulSoup

Time: 2016-10-12 19:49:43

Tags: python web-scraping beautifulsoup

Given the code:

import urllib
import urllib.request
from bs4 import BeautifulSoup
from urllib.request import urlopen

def make_soup(url):
    thepage = urllib.request.urlopen(url)
    soupdata = BeautifulSoup(thepage, "html.parser")
    return soupdata

soup = make_soup("https://www.wellstar.org/locations/pages/wellstar-acworth-practices.aspx")
for table in soup.findAll("table", class_ = "s4-wpTopTable"):
    for specialty in table.findAll("div", class_ = "PurpleBackgroundHeading"):
        specialty = specialty.get_text(strip = True)
for name in table.findAll(class_ = "WS_Location_Name"):
    name = name.get_text()
    print(specialty, " - ", name)

This code iterates over the location names correctly but over the specialty names incorrectly. For example, the code above produces:

Urology - Center for Spine Interventions, PC
Urology - WellStar Medical GroupCardiovascular Medicine

The last pair generated is Urology - Georgia Urology. How can I make sure that the specialty and location name pairs that are created match the real ones?

1 Answer:

Answer 0: (score: -1)

Could it be the indentation? How are the for loops nested? In Python I would expect something like:

for table ...:
    for specialty ...:
        spe..
    for name ...:
        name ...
        print ...
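
Note that whenever the specialty loop runs to completion before the name loop starts, the variable specialty keeps only its last value, so every name gets paired with that one specialty. That would be consistent with the repeated Urology in the question. A minimal sketch with plain lists, where the data is made up purely for illustration:

```python
# Hypothetical data standing in for the scraped headings and location names.
specialties = ["Cardiology", "Neurology", "Urology"]
names = ["Clinic A", "Clinic B"]

pairs = []
for specialty in specialties:
    # Each pass overwrites the previous value; nothing is stored or printed here.
    specialty = specialty.strip()

# The loop above has finished, so `specialty` still holds its last value.
for name in names:
    pairs.append((specialty, name))

print(pairs)  # every name is paired with the last specialty
```

Only the last specialty survives the first loop, which is why the names all come out with the same label.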

Or, with the name loop nested inside the specialty loop:

for table ...:
    for specialty ...:
        spe..
        for name ...:
            name ...
            print ...
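
Be aware that the nested arrangement pairs every name in a table with every heading in that same table, so it is only right if each table holds a single specialty. If the headings and names instead appear interleaved in document order (an assumption about the page that I have not verified), one option is to walk them in order and remember the most recent heading. A rough sketch, where the markup in SAMPLE_HTML is invented and the helpers is_heading_or_name and pair_specialties are hypothetical names:

```python
from bs4 import BeautifulSoup

# Invented markup: an assumption about the layout, not the real page.
SAMPLE_HTML = """
<table class="s4-wpTopTable">
  <tr><td>
    <div class="PurpleBackgroundHeading">Cardiology</div>
    <div class="WS_Location_Name">Clinic A</div>
    <div class="WS_Location_Name">Clinic B</div>
    <div class="PurpleBackgroundHeading">Urology</div>
    <div class="WS_Location_Name">Clinic C</div>
  </td></tr>
</table>
"""

HEADING_CLASS = "PurpleBackgroundHeading"
NAME_CLASS = "WS_Location_Name"


def is_heading_or_name(tag):
    """Match tags that carry either the heading class or the name class."""
    classes = tag.get("class") or []
    return HEADING_CLASS in classes or NAME_CLASS in classes


def pair_specialties(html):
    """Return (specialty, name) pairs, walking the tags in document order."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    current_specialty = None  # most recent heading seen so far
    for tag in soup.find_all(is_heading_or_name):
        text = tag.get_text(strip=True)
        if HEADING_CLASS in (tag.get("class") or []):
            current_specialty = text
        else:
            pairs.append((current_specialty, text))
    return pairs


pairs = pair_specialties(SAMPLE_HTML)
for specialty, name in pairs:
    print(specialty, "-", name)
```

If the real page groups things differently, for example one table per specialty, the same idea applies but the lookup of the heading would move inside the per-table loop.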