I have this code:
# Python 2
import urllib
from bs4 import BeautifulSoup
url = 'http://www.brothersoft.com/synthfont-159403.html'
pageHtml = urllib.urlopen(url).read()
soup = BeautifulSoup(pageHtml)
# Selects only the <a href> tags inside the list
for a in soup.select('div.Updated.coLeft ul a[href]'):
    print a.string
But it gives me this output:
Kenneth Rundt
What I need is all the information inside the div with the classes Updated and coLeft. How can I get it?
Answer 0 (score: 2)
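Your selector matches only the a tags, and only one list item contains a link. Select the li elements instead and join the text of each one: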
>>> for li in soup.select('div.Updated.coLeft li'):
...     print ' '.join(li.stripped_strings)
...
Last Updated: Dec 27, 2012
License: Freeware Free
OS: Windows 7/Vista/XP
Requirements: No special requirements
Publisher: Kenneth Rundt (4 more Applications)
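
For anyone on Python 3, here is a minimal offline sketch of the same idea. The SAMPLE_HTML markup is an assumption made up to resemble the page's Updated coLeft block, not a copy of the real page, and the helper names extract_rows and extract_link_texts are hypothetical. It shows why selecting the a tags yields only the publisher name, while joining each li's stripped_strings yields the whole row.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to resemble the target page's block.
SAMPLE_HTML = """
<div class="Updated coLeft">
  <ul>
    <li><span>Last Updated:</span> Dec 27, 2012</li>
    <li><span>License:</span> Freeware <b>Free</b></li>
    <li><span>Publisher:</span> <a href="/publisher.html">Kenneth Rundt</a> (4 more Applications)</li>
  </ul>
</div>
"""


def extract_link_texts(html):
    """Mimic the question's selector: only the text of <a href> tags."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True)
            for a in soup.select("div.Updated.coLeft ul a[href]")]


def extract_rows(html):
    """Mimic the answer: one space-joined string per <li>."""
    soup = BeautifulSoup(html, "html.parser")
    return [" ".join(li.stripped_strings)
            for li in soup.select("div.Updated.coLeft li")]


if __name__ == "__main__":
    print(extract_link_texts(SAMPLE_HTML))
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

stripped_strings walks every text node under the li, strips surrounding whitespace, and skips whitespace-only nodes, so text in nested tags such as span, b, and a is included along with the bare text between them.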