How to fix AttributeError: 'HTMLParserTreeBuilder' object has no attribute 'initialize_soup'

Time: 2019-06-05 21:08:02

Tags: python python-3.x beautifulsoup

I am getting the following error:

AttributeError: 'HTMLParserTreeBuilder' object has no attribute 'initialize_soup'

I am trying to find the XPath of the "M" checkbox on eBay (link).

I am using Spyder, and as far as I can tell bs4 is imported correctly.

import requests
from bs4 import BeautifulSoup

web_page = requests.get('https://www.ebay.com/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=mens+shirt&_sacat=0')
web_soup = BeautifulSoup(web_page.text, 'html.parser')
checkbox = web_soup.find(class_='cbx x-refine__multi-select-checkbox')
checkbox_names = checkbox.find_all('a')

for check in checkbox_names:
    print(check.prettify())

I am expecting output like this:

<a href="/web/20121007172955/https://www.nga.gov/cgi-bin/tsearch?artistid=11630">
 Zabaglia, Niccola
</a>

I am following this tutorial to help me write the code.

1 answer:

Answer 0 (score: 0)

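First, your find(class_='cbx x-refine__multi-select-checkbox') call selects only the first element with the class cbx x-refine__multi-select-checkbox, so the find_all('a') that follows searches inside that single element only.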
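To illustrate the difference between find and find_all, here is a minimal sketch that runs offline. The HTML fragment is hypothetical: it reuses the class names and the aria-label attribute mentioned in this answer, but the real eBay markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment for illustration only; not copied from eBay.
html = """
<ul>
  <li><a class="cbx x-refine__multi-select-link" href="/size-s">
    <input class="cbx x-refine__multi-select-checkbox" aria-label="S"></a></li>
  <li><a class="cbx x-refine__multi-select-link" href="/size-m">
    <input class="cbx x-refine__multi-select-checkbox" aria-label="M"></a></li>
  <li><a class="cbx x-refine__multi-select-link" href="/size-l">
    <input class="cbx x-refine__multi-select-checkbox" aria-label="L"></a></li>
</ul>
"""

demo_soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching element (or None if nothing matches).
first_box = demo_soup.find(class_="x-refine__multi-select-checkbox")
first_label = first_box.get("aria-label")

# find_all() returns every matching element as a list.
all_boxes = demo_soup.find_all(class_="x-refine__multi-select-checkbox")
all_labels = [box.get("aria-label") for box in all_boxes]

print(first_label)  # S
print(all_labels)   # ['S', 'M', 'L']
```

Searching on the single class x-refine__multi-select-checkbox matches any element that carries that class, regardless of which other classes it has.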

To get the URL for the "size M shirts" filter, you can do the following:

Code:

import requests
from bs4 import BeautifulSoup

web_page = requests.get('https://www.ebay.com/sch/i.html?_from=R40&_trksid=m570.l1313&_nkw=mens+shirt&_sacat=0')
web_soup = BeautifulSoup(web_page.text, 'html.parser')

# Each size filter is an <a> link that wraps an <input> checkbox.
links = web_soup.find_all('a', class_='x-refine__multi-select-link')
for link in links:
    # Match on a single class so stray whitespace in the attribute does not matter.
    checkbox = link.find('input', class_='x-refine__multi-select-checkbox')
    # The checkbox's aria-label holds the size name.
    if checkbox and checkbox.get('aria-label') == 'M':
        print(link.get('href'))

Output:

https://www.ebay.com/sch/i.html?_from=R40&_nkw=mens+shirt&_sacat=0&rt=nc&Size%2520%2528Men%2527s%2529=M&_dcat=185100
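
The code above does not address the AttributeError from the question itself. One possible cause, which is an assumption and not something the question confirms, is a mixed or partially upgraded bs4 installation in which the tree builder modules are older than the rest of the package. The sketch below only inspects the bs4 that Python actually imports; check_bs4_install is a hypothetical helper name, not part of bs4.

```python
import bs4
from bs4.builder._htmlparser import HTMLParserTreeBuilder


def check_bs4_install():
    """Collect basic facts about the bs4 package that was imported."""
    return {
        # Version string reported by the package itself.
        "version": bs4.__version__,
        # Location on disk; useful for spotting a stray or duplicate copy.
        "path": bs4.__file__,
        # The method named in the error message; False suggests stale builder code.
        "has_initialize_soup": hasattr(HTMLParserTreeBuilder, "initialize_soup"),
    }


info = check_bs4_install()
for key, value in info.items():
    print(f"{key}: {value}")
```

If has_initialize_soup comes back False, reinstalling the package (for example with pip install --upgrade --force-reinstall beautifulsoup4) and restarting the Spyder kernel may help, although whether that resolves this particular case is not confirmed by the question.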