Why do I get an error after scraping the last item on the web page?

Asked: 2020-05-01 06:06:58

Tags: python html web-scraping

I wrote a program that retrieves product names and prices from Newegg, but after it processes the last product on the page it raises "AttributeError: 'NoneType' object has no attribute 'strong'". I'm fairly sure this is a null-reference problem, since the loop walks over every element on the page. I tried iterating only up to `itemContainers - 1` and setting a breakpoint at that index, but it still fails. Also, where should I put `Client.close()` — at the very end?

import bs4
#uReq is our shorthand for urllib.request.urlopen
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

#The URL we plan to use
my_url = 'https://www.newegg.com/'

#uReq(my_url) opens up a web client
Client = uReq(my_url)
#Client.read() dumps out everything from the URL
html_page = Client.read()
Client.close()


page_soup = soup(html_page, "html.parser")
itemContainers= page_soup.findAll("div", {"class":"item-container"})

for i in range(0, len(itemContainers)):
    if i == len(itemContainers) - 1:
        breakpoint()
    #itemTitles is a list of all of the titles found on the web page
    itemTitles = page_soup.findAll("a", {"class": "item-title"})

    divWithPriceInfo = itemContainers[i].find("ul", "price")
    left_Dec = divWithPriceInfo.strong.text
    right_Dec = divWithPriceInfo.sup.text
    stringStrong = str(left_Dec)
    stringSup = str(right_Dec)
    print(itemTitles[i].text)
    print(stringStrong + stringSup)
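For context, the error typically means that some `item-container` div (often an ad or banner tile) has no `<ul class="price">` child, so `find` returns `None` and `.strong` fails on it. A minimal sketch of a None guard, run here against a small hypothetical HTML snippet rather than the live Newegg page (the class names mirror the question's code; the sample markup is made up). Since `Client.read()` has already pulled the whole response into memory, calling `Client.close()` right after the read, as the question's code does, is fine:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second container carries no price block,
# mimicking the ad/banner tiles that trigger the AttributeError.
sample_html = """
<div class="item-container">
  <a class="item-title">Widget A</a>
  <ul class="price"><strong>19</strong><sup>.99</sup></ul>
</div>
<div class="item-container">
  <a class="item-title">Sponsored tile with no price</a>
</div>
"""

page_soup = BeautifulSoup(sample_html, "html.parser")
results = []
for container in page_soup.find_all("div", {"class": "item-container"}):
    title = container.find("a", {"class": "item-title"})
    price_ul = container.find("ul", {"class": "price"})
    # Guard against containers that have no price markup at all
    if price_ul is None or price_ul.strong is None or price_ul.sup is None:
        continue
    results.append((title.text, price_ul.strong.text + price_ul.sup.text))

print(results)  # only the container with a real price survives
```

Searching for titles and prices *inside each container* (rather than calling `page_soup.findAll` for titles on every loop iteration, as in the question) also keeps title `i` paired with price `i` even when some containers are skipped.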

0 answers:

No answers