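I have 4 URLs, and I need the sleeve detail from each one. The position of the sleeve detail changes, so the node that holds it changes as well: for the first URL the sleeve detail is at position 2, and for the other three URLs it is at position 3. I need output like this: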
URLS Sleeves
http://www.jabong.com/belle-fille-Green-Solid-Winter-Jacket-1310755.html?pos=5&cid=BE797WA44OZRINDFAS Full Sleeves
http://www.jabong.com/oxolloxo-Off-White-Solid-Reversible-Blazer-2687327.html?pos=8&cid=OX344WA72XITINDFAS Long Sleeve
http://www.jabong.com/oxolloxo-Multicoloured-Checked-Blazer-2784283.html?pos=16&cid=OX344WA16KTVINDFAS 3/4th Sleeves
http://www.jabong.com/mirika-Blue-Embellished-WINTER-JACKET-2754538.html?pos=19&cid=MI137WA61STUINDFAS Sleeveless
Here is part of my code:
for 1st url : soup.find_all("span", {"class":"product-info-left"})[1].next_sibling.text
for 2nd to 4th url : soup.find_all("span", {"class":"product-info-left"})[2].next_sibling.text
Answer 0 (score: 1)
soup.find("span", text="Sleeves").next_sibling.text
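A minimal sketch of why searching by label text works regardless of position. The HTML below is made up for illustration and is only an assumption about how the page pairs each label with its value; it is not copied from jabong.com. The helper name sleeve_detail is hypothetical. It uses string= (the current name for the text= argument) and find_next_sibling("span"), which skips any whitespace text node between the two tags.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the "Sleeves" label sits at a different index in each page.
html_a = """
<ul>
  <li><span class="product-info-left">Fabric</span> <span>Wool</span></li>
  <li><span class="product-info-left">Sleeves</span> <span>Full Sleeves</span></li>
</ul>
"""
html_b = """
<ul>
  <li><span class="product-info-left">Fabric</span> <span>Cotton</span></li>
  <li><span class="product-info-left">Fit</span> <span>Regular</span></li>
  <li><span class="product-info-left">Sleeves</span> <span>Long Sleeve</span></li>
</ul>
"""

def sleeve_detail(html):
    """Return the text of the element that follows the 'Sleeves' label, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Locate the label by its exact text instead of by its index.
    label = soup.find("span", string="Sleeves")
    if label is None:
        return None
    # Assumption: the value lives in the next <span> sibling of the label.
    value = label.find_next_sibling("span")
    return value.get_text(strip=True) if value is not None else None

for page in (html_a, html_b):
    print(sleeve_detail(page))
```

If the real page puts extra whitespace inside the label span, an exact string match would miss it; a regular expression or a function filter would be a more forgiving choice.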
Answer 1 (score: 0)
You can search for only the strings that contain 'Sleeve'.
def check(text):
    # Keep only strings that exist and contain 'Sleeve'
    return text is not None and text.find('Sleeve') > -1

sleeves = soup.find_all(string=check)
print(sleeves[1])
Output
Full Sleeves
Long Sleeve
3/4th Sleeves
Sleeveless
To learn more about filtering with a function, check this link.
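A small self-contained sketch of the function filter, run against made-up HTML (an assumption about the page layout, not the real jabong.com markup). The function name mentions_sleeve is hypothetical. In this sample both the label and the value contain 'Sleeve', so the filter returns two strings in document order and the value is the second one.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with one label/value pair that mentions sleeves.
html = """
<ul>
  <li><span class="product-info-left">Fit</span> <span>Regular</span></li>
  <li><span class="product-info-left">Sleeves</span> <span>Full Sleeves</span></li>
</ul>
"""

def mentions_sleeve(text):
    # find_all(string=...) calls this with each string in the document.
    return text is not None and "Sleeve" in text

soup = BeautifulSoup(html, "html.parser")
matches = [str(s) for s in soup.find_all(string=mentions_sleeve)]
print(matches)
```

Because the index of the value depends on how many other strings on the page contain 'Sleeve', it is worth printing the whole list once before relying on a fixed index.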