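I am using BeautifulSoup4 to parse the HTML content of a URL, but after parsing, some tags that exist in the actual HTML seem to be missing from the result. Please see the code below: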

from bs4 import BeautifulSoup
import urllib2

site = urllib2.urlopen('http://www.shazam.com/charts/usa_top_100')
html_content = site.read()             # Raw HTML as fetched; check it for the expected tags
soup = BeautifulSoup(html_content)     # Compare the parsed result with the raw HTML from the previous step
track_tags = soup.find_all('article')  # Returns an empty list even though '<article>' tags exist in the HTML content

What I need is to parse all of the 'article' tags, and their contents, from that URL.
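
One way to narrow this down, as a rough sketch only: count the `<article>` start tags in the raw HTML using the standard library's `html.parser`, independently of BeautifulSoup. If the count is zero, the tags are probably added by JavaScript after the page loads. If it is non-zero, the parser that BeautifulSoup picked by default is the more likely cause, and naming one explicitly, for example `BeautifulSoup(html_content, 'html5lib')`, may be worth trying. The sketch is written for Python 3 (the code above is Python 2), and the names `TagCounter`, `count_tags`, and `sample` are hypothetical, not part of any library.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags with a given name (hypothetical helper)."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == self.tag_name:
            self.count += 1


def count_tags(html_text, tag_name):
    """Return how many start tags named tag_name occur in html_text."""
    counter = TagCounter(tag_name)
    counter.feed(html_text)
    counter.close()
    return counter.count


# Stand-in for the fetched page; in practice pass the decoded html_content.
sample = (
    "<html><body>"
    "<article><a href='/t/1'>One</a></article>"
    "<article>Two</article>"
    "</body></html>"
)
print(count_tags(sample, "article"))
```

Comparing this count with `len(soup.find_all('article'))` should show whether the tags are lost during parsing or were never in the downloaded HTML in the first place.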

Thanks

Python - BeautifulSoup4 error: some tags are removed after parsing

0 answers: