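I want to display the content of this web page as a table: http://movie.webindia123.com/movie/showtimes/asp/search_result.asp?language=57&district_name=42&city_name=118 However, when I use Beautiful Soup, the body tag seems to get corrupted, with a space between every character. The source code I am using: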
from bs4 import BeautifulSoup
import requests
url="http://movie.webindia123.com/movie/showtimes/asp/search_result.asp?language=57&district_name=42&city_name=118"
r = requests.get(url)
soup = BeautifulSoup(r.text)
print(soup)
for hit in soup.find_all(attrs={'class': 'section group'}):
    print(hit.get_text())
Answer 0 (score: 0)
Use the mechanize module to fetch the web document, then parse it with Beautiful Soup. A code snippet is given below:
import mechanize
from bs4 import BeautifulSoup

# `url` (the page to fetch) and `site` (the value to submit) are assumed to be defined earlier.

# Get HTML
browser = mechanize.Browser()
cj = mechanize.LWPCookieJar()  # cookie jar so the session persists across requests
browser.set_cookiejar(cj)
response = browser.open(url)
html = response.read()

# Fill in and submit the form named "trace" (only needed if the page has such a form)
browser.select_form(name="trace")
browser["mobilenumber"] = str(site)
browser.submit()
html = browser.response().read()

# Parse HTML with BeautifulSoup
soup = BeautifulSoup(html, "lxml")
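A space between every character is a typical symptom of UTF-16 bytes being decoded with a single-byte encoding. Whether that is what happens on this particular page is an assumption, not something confirmed here. If it is the cause, passing the raw bytes (for example `r.content` from requests instead of `r.text`) lets Beautiful Soup detect the encoding itself. The sketch below illustrates the idea on a small in-memory sample; the helper name `extract_sections` and the sample markup are hypothetical, and only the `section group` class comes from the question.

```python
from bs4 import BeautifulSoup


def extract_sections(html_bytes):
    """Return the text of each element whose class attribute is "section group".

    Hypothetical helper: it takes raw bytes so that Beautiful Soup,
    rather than the HTTP client, decides how to decode the document.
    """
    soup = BeautifulSoup(html_bytes, "html.parser")
    hits = soup.find_all(attrs={"class": "section group"})
    # Join the text fragments of each hit with a single space.
    return [hit.get_text(" ", strip=True) for hit in hits]


# Made-up sample standing in for the real page, encoded as UTF-16 with a BOM.
sample = (
    u'<html><body><div class="section group">'
    u"<span>Movie A</span><span>10:30</span>"
    u"</div></body></html>"
).encode("utf-16")

rows = extract_sections(sample)
for row in rows:
    print(row)
```

With a live request, the call would be `extract_sections(requests.get(url).content)`. If the page does not declare its encoding with a byte order mark or a meta tag, it may also be necessary to pass `from_encoding=` to `BeautifulSoup` explicitly.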