I am currently using mechanize to automatically follow certain links on a website. The problem is that mechanize does not find all the links on my site. When I use:
>>> import mechanize
>>> web = mechanize.Browser()
>>> r = web.open('http://torrent.ajee.sh/hash.php?hash=ee59bf932540976857c38eee56e2a598154a9963')
>>> print r.read()
it actually prints:
Adulterers.2015.HDRip.XviD.AC3-EVO (1.38 GB)<br/>Files: <br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=0' target='_top' >Adulterers.2015.HDRip.XviD.AC3-EVO.avi </a> (1.4 GB) -- (<font color=''>100% </font> Cached)<br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=1' target='_top' >Adulterers.2015.HDRip.XviD.AC3-EVO.nfo </a> (2 KB) -- (<font color=''>100% </font> Cached)<br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=3' target='_top' >sample.avi </a> (16.1 MB) -- (<font color=''>100% </font> Cached)<br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=2' target='_top' >Torrent Downloaded From ExtraTorrent.cc.txt </a> (338 B ) -- (<font color=''>100% </font> Cached)<br/><br/><br/><a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=zip' target='_top'>Download Zip</a> (1.38 GB)<br/><br/> If Download links not working, then Try again after few mints, <b>Files are been Cached(100%)</b>.<br/><br/>
and there are five links in total!
But when I use:
print list(web.links())
it only contains the first link from that source! What is the problem?
[Link(base_url='http://torrent.ajee.sh/hash.php?hash=ee59bf932540976857c38eee56e2a598154a9963', url='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=0', text='Adulterers.2015.HDRip.XviD.AC3-EVO.avi', tag='a', attrs=[('href', '/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=0'), ('target', '_top')])]
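As an independent cross-check (not part of the original question), the number of anchor tags in the raw HTML can be counted with Python's standard-library `html.parser`; the `page` string below is a trimmed, hypothetical stand-in for the source shown above, with shortened hrefs:

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Collects the href attribute of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))

# Trimmed stand-in for the page source above (hrefs shortened for illustration).
page = (
    "Files: <br/>"
    "O <a href='/file.php?file=0' target='_top'>a.avi</a><br/>"
    "O <a href='/file.php?file=1' target='_top'>a.nfo</a><br/>"
    "O <a href='/file.php?file=3' target='_top'>sample.avi</a><br/>"
    "O <a href='/file.php?file=2' target='_top'>readme.txt</a><br/>"
    "<a href='/file.php?file=zip' target='_top'>Download Zip</a>"
)

counter = LinkCounter()
counter.feed(page)
print(len(counter.hrefs))  # 5 anchor tags in the source
```

If the raw HTML really contains five `<a>` tags but mechanize reports only one, the discrepancy is in how mechanize parses the page, not in the page itself.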
Sorry for my English!
Answer 0 (score: 0)
You have to iterate over the links:
for link in web.links():
print(link.text, link.url)
links() is a generator: it hands you the next link each time you ask for one.
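The generator behaviour can be sketched with a toy stand-in; the `links()` stub and its URLs below are illustrative only, not mechanize's real implementation:

```python
def links():
    # Toy stand-in for a links() generator: yields one link URL at a time
    # instead of building the whole list up front.
    for url in ("/file.php?file=0", "/file.php?file=1", "/file.php?file=zip"):
        yield url

# Iterating pulls one item per loop step from the generator...
for url in links():
    print(url)

# ...and list() drains a fresh generator in one go, collecting every item.
all_links = list(links())
print(len(all_links))  # 3
```

Note that both forms consume every item the generator produces; a `for` loop and `list()` differ only in whether the items are printed one by one or collected into a list.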