I am trying to use Python to scrape the text inside <span class=''> elements.
The markup on the page I am scraping looks like this:
<li class="item">
<span class="name">Sara</span>
<span class="value">selling potato in town</span>
</li>
<li class="item">
<span class="name">Grouping</span>
<span class="value">clothes</span>
</li>
<li class="item">
<span class="name">Phone</span>
<span class="value">
04142018071 09128983727
</span>
</li>
What I need to get is "Sara" and "selling potato in town", plus "Phone" and "04142018071 09128983727". Can you help me?
I tried the following code:
for stng1 in soup.find_all('li', class_='item'):
    for stng in stng1.find_all('span'):
        # print(stng)
        if stng.has_attr("class"):
            if stng['class'] == 'name':
                print(stng.string)
Answer 0 (score: 0)
from bs4 import BeautifulSoup

html_doc = """
<li class="item">
<span class="name">Sara</span>
<span class="value">selling potato in town</span>
</li>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
# find() returns only the first matching <li class="item">
item = soup.find("li", {"class": "item"})
name = item.find("span", {"class": "name"}).get_text()
value = item.find("span", {"class": "value"}).get_text()
print(name)
print(value)
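The snippet above reads only the first <li>, so here is a minimal sketch that walks every item, assuming the full markup from the question. It also shows why the attempt in the question prints nothing: BeautifulSoup treats `class` as a multi-valued attribute and returns it as a list, so comparing `stng['class']` with the string `'name'` is never true. The `wanted` set and the `pairs` dict are illustrative names, not part of the original post.

```python
from bs4 import BeautifulSoup

# Markup copied from the question.
html_doc = """
<li class="item">
<span class="name">Sara</span>
<span class="value">selling potato in town</span>
</li>
<li class="item">
<span class="name">Grouping</span>
<span class="value">clothes</span>
</li>
<li class="item">
<span class="name">Phone</span>
<span class="value">
04142018071 09128983727
</span>
</li>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# class is multi-valued, so it comes back as a list, not a string.
first_span = soup.find("span")
print(first_span["class"])  # ['name']

# Hypothetical filter: the names the question asks for.
wanted = {"Sara", "Phone"}

pairs = {}
for item in soup.find_all("li", class_="item"):
    name_tag = item.find("span", class_="name")
    value_tag = item.find("span", class_="value")
    if name_tag is None or value_tag is None:
        continue  # skip items missing either span
    name = name_tag.get_text(strip=True)
    if name in wanted:
        # strip=True removes the newlines around the phone numbers
        pairs[name] = value_tag.get_text(strip=True)

for name, value in pairs.items():
    print(name, value)
```

Dropping the `wanted` check would collect every name/value pair, including "Grouping".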
Answer 1 (score: 0)
Try:
from simplified_scrapy.simplified_doc import SimplifiedDoc

# html is assumed to hold the page source shown in the question
doc = SimplifiedDoc(html)
lst = doc.getElements(tag='li', value='item')
for i in lst:
    children = i.getChildren()
    for j in children:
        print('%s=%s' % (j['class'], j.text))