Python 3.6.3 Anaconda / Spyder no longer produces print output

Time: 2018-01-23 15:25:17

Tags: python printing beautifulsoup spyder

When I try to run this code in Spyder, nothing happens. I don't get any errors; there is simply no printed output:

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.newegg.com/Power-Banks/SubCategory/ID-3724?cm_sp=Cat_Batteries-Power-Banks-Chargers_1-_-VisNav-_-Power-Banks'

# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# html parsing
page_soup = soup(page_html, "html.parser")

# grab products
containers = page_soup.findAll("div",{"class":"item-container"})

for container in containers:
    brand = container.div.div.a.img["title"]


# get the product name
    title_container = container.findAll("a", {"class":"item-title"})
    product_name = title_container[0].text # search the text in the first index of the list of <a></a> 

# find shipping prices
    shipping_container = container.findAll("li", {"class":"price-ship"})
    shipping = shipping_container[0].text.strip()

    print("brand:"  + brand)
    print("product_name:" + product_name)
    print("shipping:" + shipping)

What could be the problem?

1 Answer:

Answer 0: (score: 0)

When I tested this, containers did contain a valid set of results. There is another problem, though: not every container has a suitable .div.div.a.img["title"] element:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.newegg.com/Power-Banks/SubCategory/ID-3724?cm_sp=Cat_Batteries-Power-Banks-Chargers_1-_-VisNav-_-Power-Banks'

# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# html parsing
page_soup = soup(page_html, "html.parser")

# grab products
containers = page_soup.findAll("div", {"class":"item-container"})

if len(containers) == 0:
    print(page_html)        # diagnose reason for no containers
else:
    for container in containers:
        try:
            brand = container.div.div.a.img["title"]
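        except (AttributeError, TypeError, KeyError):
            # skip containers without the nested brand image or its title
            pass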
        else:
            # get the product name
            title_container = container.findAll("a", {"class":"item-title"})
            product_name = title_container[0].text  # text of the first matching <a> tag

            # find shipping prices
            shipping_container = container.findAll("li", {"class":"price-ship"})
            shipping = shipping_container[0].text.strip()

            print("brand:"  + brand)
            print("product_name:" + product_name)
            print("shipping:" + shipping)

This can be handled with exception handling, as in the code above. It produces results like the following:

brand:Orico
product_name:[Qualcomm Certified Quick Charge 3.0] ORICO TS1-BK 10000 mAh QC3.0 & USB-C / Type-C Port Portable Charger External Battery Pack Power Bank for Phones, Tablet and More
shipping:Free Shipping
brand:Duracell Powermat
product_name:Duracell Powermat White 2X Charging Mat M2PW1
shipping:Free Shipping
brand:ADATA
product_name:ADATA D8000L 8000mAh w/ 200 Lumens LED (AD8000L-5V-CBK)
shipping:Free Shipping
brand:Mophie
product_name:mophie Juice Pack Powerstation Green
shipping:Free Shipping
brand:SAMSUNG
product_name:Fast Charge Battery Pack (10.2A), Black
shipping:$7.73 Shipping
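
As an alternative to catching exceptions, the missing elements can also be checked for explicitly. The following is only a minimal sketch, not the code from the answer: SAMPLE_HTML is made-up markup standing in for the Newegg page, and parse_items and text_of are hypothetical helper names. It relies on find() returning None when nothing matches. Unlike the code above, it keeps items without a brand and reports the brand as None instead of skipping them, and it looks for the first img with a title attribute rather than following the exact .div.div.a.img path.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page; the second
# container deliberately lacks the nested brand image.
SAMPLE_HTML = """
<div class="item-container">
  <div><div><a href="#"><img title="Orico" src="orico.png"></a></div></div>
  <a class="item-title" href="#">ORICO TS1-BK 10000 mAh Power Bank</a>
  <ul><li class="price-ship"> Free Shipping </li></ul>
</div>
<div class="item-container">
  <a class="item-title" href="#">Unbranded Power Bank</a>
  <ul><li class="price-ship">$7.73 Shipping</li></ul>
</div>
"""


def text_of(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_items(html):
    """Collect brand, product name and shipping for each item container."""
    page_soup = BeautifulSoup(html, "html.parser")
    items = []
    for container in page_soup.find_all("div", class_="item-container"):
        # find() returns None instead of raising when nothing matches
        brand_img = container.find("img", title=True)
        items.append({
            "brand": brand_img["title"] if brand_img is not None else None,
            "product_name": text_of(container.find("a", class_="item-title")),
            "shipping": text_of(container.find("li", class_="price-ship")),
        })
    return items


if __name__ == "__main__":
    for item in parse_items(SAMPLE_HTML):
        print(item)
```

Because every lookup is checked for None, a container that is missing one field still yields a row, which makes it easier to see which products the page lists without a brand image.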