I have been trying to extract the name from a Twitter profile. The only problem I'm having is that BeautifulSoup grabs the text of the entire element. I have tried using {"class":}
to specify the element, but whenever I do this I get the following error:
AttributeError: 'NoneType' object has no attribute 'text'
My code:
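import urllib.request
from bs4 import BeautifulSoup

url = "https://twitter.com/barackobama"
html_doc = urllib.request.urlopen(url)
soup = BeautifulSoup(html_doc, 'lxml')
# .text on the <h1> returns the text of the whole heading, children included
name = soup.find('h1').text
print(name)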
Answer 0 (score: 4)
If you want the text from the link nested inside the heading rather than the full heading text, try
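import urllib.request
from bs4 import BeautifulSoup

url = "https://twitter.com/barackobama"
html_doc = urllib.request.urlopen(url)
soup = BeautifulSoup(html_doc, 'lxml')
# .a selects the first <a> inside the <h1>, so only the link text is returned
name = soup.find('h1').a.text
print(name)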
# 'Barack Obama'
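The AttributeError from the question appears whenever find() matches nothing and returns None, so it can help to check each lookup before reading its text. Below is a minimal sketch of that idea, run against a small inline HTML string rather than the live page. The markup, the profile-name class, and the extract_name helper are assumptions made up for illustration and do not reflect Twitter's real page structure; the built-in html.parser is used in place of lxml only to keep the sketch dependency-light.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a profile page; the real page differs.
SAMPLE_HTML = """
<html><body>
  <h1 class="profile-name">
    <a href="/barackobama">Barack Obama</a>
    <span>Verified account</span>
  </h1>
</body></html>
"""


def extract_name(html, css_class=None):
    """Return the text of the first link inside the first matching <h1>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Only filter by class when one is given; find() returns None on no match.
    attrs = {"class": css_class} if css_class else {}
    heading = soup.find("h1", attrs)
    if heading is None or heading.a is None:
        return None
    # get_text(strip=True) drops the whitespace surrounding the link text.
    return heading.a.get_text(strip=True)


print(extract_name(SAMPLE_HTML))
print(extract_name(SAMPLE_HTML, css_class="profile-name"))
print(extract_name(SAMPLE_HTML, css_class="no-such-class"))
```

With a class name that does not occur in the markup, the helper returns None instead of raising, which makes it easier to see that the selector, not the .text call, is the part that needs fixing.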