For the link
http://www.jabong.com/Adidas-Base-Mid-Dd-Blue-Round-Neck-T-Shirt-2733238.html
I need to get the product's fabric detail, "Polyester", but I get "Fabric" as the output instead. Below is the relevant part of my code.
soup.find_all("span", {"class":"product-info-left"})[0].text
Answer 0 (score: 1)
Use the next_sibling of the node you found.
soup.find_all("span", {"class":"product-info-left"})[0].next_sibling.text
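The original call returns the label span itself, which is why it prints "Fabric"; the value sits in the node that follows it. One caveat: next_sibling can be a whitespace-only string when the markup has line breaks between tags, so find_next_sibling() with a tag name may be a safer choice. The sketch below illustrates this on a small inline snippet. The HTML is an assumption modelled on the question, not the real page markup, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the question; the real page may differ.
html = """
<ul class="prod-main-wrapper">
  <li>
    <span class="product-info-left">Fabric</span>
    <span>Polyester</span>
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# The first matching span is the label, not the value.
label = soup.find("span", {"class": "product-info-left"})
label_text = label.text

# find_next_sibling("span") skips the whitespace text node between the spans.
value = label.find_next_sibling("span")
value_text = value.text if value is not None else None

print(label_text, value_text)
```

If the page puts the two spans directly next to each other with no whitespace, plain next_sibling behaves the same way.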
Answer 1 (score: 0)
You can use .next or .next_sibling here:
>>> soup.find_all("span", {"class":"product-info-left"})[0].next.next.text
'Polyester'
>>> soup.find_all("span", {"class":"product-info-left"})[0].next_sibling.text
'Polyester'
Answer 2 (score: 0)
The information you need is in the li tags under the ul tag. Find the ul first, then get all the text in each li using stripped_strings:
In [45]: import requests
In [46]: from bs4 import BeautifulSoup
In [47]: r = requests.get('http://www.jabong.com/Adidas-Base-Mid-Dd-Blue-Round-Neck-T-Shirt-2733238.html')
In [48]: soup = BeautifulSoup(r.text, 'lxml')
In [49]: ul = soup.find('ul', class_="prod-main-wrapper")
In [50]: for li in ul.find_all('li'):
    ...:     print(list(li.stripped_strings))
    ...:
['Fabric', 'Polyester']
['Sleeves', 'Half Sleeves']
['Neck', 'Round Neck']
['Fit', 'Regular']
['Color', 'Blue']
['Style', 'Solid']
['SKU', 'AD004MA61NGOINDFAS']
['Model Stats', 'This model has height 6\'0",Chest 38",Waist 34"and is Wearing Size M.']
['Authorization', 'Adidas authorized online sales partner.', 'View Certificate']
If you only want the first row, you can use find(), which returns the first element that find_all() would return:
In [51]: text = ul.find('li').stripped_strings
In [52]: print(list(text))
['Fabric', 'Polyester']
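If you want to look a value up by its label rather than by position, the rows can be collected into a dict. This is a minimal sketch on an inline snippet: the HTML is an assumption modelled on the output above rather than the real page, and details is just an illustrative name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the output shown above; the real page may differ.
html = """
<ul class="prod-main-wrapper">
  <li><span class="product-info-left">Fabric</span> <span>Polyester</span></li>
  <li><span class="product-info-left">Neck</span> <span>Round Neck</span></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")
ul = soup.find("ul", class_="prod-main-wrapper")

details = {}
if ul is not None:  # find() returns None when nothing matches
    for li in ul.find_all("li"):
        parts = list(li.stripped_strings)
        if len(parts) >= 2:
            # First string is the label; join the rest in case the value
            # is split across several nodes.
            details[parts[0]] = " ".join(parts[1:])

print(details.get("Fabric"))
```

Using details.get("Fabric") avoids depending on the order of the rows, which may change from one product page to another.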