I have:
    try:
        page = requests.get(Scrape.site_to_scrape['git']+gitUser)
        tree = urllib.urlopen(page).read()
        soup = BS(response)
        parse_git_full_name = soup.find("span", {"class":"vcard-fullname"}).get_text()
        return parse_git_full_name
    except:
        print "Syntax: python site_scrape.py -g <git user name here>"
However, it keeps falling into the except: block.
I am trying to parse an element like this:
    <span class="vcard-fullname" itemprop="name">The name</span>
I am trying to get the text inside the <span>.
Answer 0 (score: 1)
I solved this with a single xpath selector. Hopefully this helps others who are pulling their hair out over beautifulsoup selectors.
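    import requests
    from lxml import html

    try:
        page = requests.get(Scrape.site_to_scrape['git']+gitUser)
        # Parse the response body into an element tree
        tree = html.fromstring(page.text)
        # xpath() returns a list of the matching text nodes
        full_name = tree.xpath('//span[@class="vcard-fullname"]/text()')
        print 'Full Name: ', full_name
    except:
        print "Syntax: python site_scrape.py -g <git user name here>"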
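
Note that xpath() returns a list of matches, not a single string, so full_name still has to be unpacked to get the bare name. Here is a minimal sketch of that step, assuming lxml is installed. SAMPLE_HTML and extract_full_name are names made up for illustration, and the markup is hard-coded rather than fetched with requests so the snippet runs on its own:

```python
from lxml import html

# Made-up sample markup for illustration; it mirrors the <span> from the question.
SAMPLE_HTML = '<div><span class="vcard-fullname" itemprop="name">The name</span></div>'


def extract_full_name(page_text):
    """Return the text of the first vcard-fullname span, or None if there is none."""
    tree = html.fromstring(page_text)
    # xpath() returns a list of matching text nodes, which may be empty.
    matches = tree.xpath('//span[@class="vcard-fullname"]/text()')
    return matches[0].strip() if matches else None


if __name__ == "__main__":
    print(extract_full_name(SAMPLE_HTML))
```

In the real script you would pass page.text instead of SAMPLE_HTML. Keep in mind that @class="vcard-fullname" only matches when the class attribute is exactly that string, so a span carrying additional classes would not be selected by this expression.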