Python - Not getting the video URL

Time: 2016-10-13 17:28:02

Tags: python beautifulsoup lxml

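I am trying to extract the URL of a video embedded in a page, using BeautifulSoup with the lxml parser. I have searched a lot for the right Python code, but after many days I still have not succeeded. My code so far is below: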

import re
from bs4 import BeautifulSoup
import requests

url = 'http://telly-loans.com/watchvideo.php?id=0kw7cgyat4p7'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36'}

with requests.Session() as session:
    session.headers = headers
    response = session.get(url)
    soup = BeautifulSoup(response.content, "lxml")
    response = session.get(soup.iframe['src'], headers={'Referer': url})
    soup = BeautifulSoup(response.content, "lxml")

    print(re.search(r'http:\/\/"(.*?)"', soup.script.text).group(1))

My Python experience is very limited, so I must be missing something here. Has anyone worked with a similar page who could help me get this working?
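
One thing that may be worth checking in the code above is the regular expression. As written, `http:\/\/"(.*?)"` only matches when a double quote comes immediately after `http://`, which an ordinary URL does not have, so `re.search` would return `None` and `.group(1)` would raise an `AttributeError`. Below is a minimal sketch of a pattern that captures a double-quoted http(s) URL instead. It assumes the player script holds the video URL as a quoted string; the sample script text, the `example.com` URLs, and the helper names `find_quoted_urls` and `first_video_url` are made up for illustration and are not taken from the real page.

```python
import re

# Hypothetical sample of what an embedded player script might contain;
# the real page's script may look quite different.
sample_script = (
    'jwplayer("vplayer").setup({file: "http://example.com/video/abc123.mp4", '
    'image: "http://example.com/abc123.jpg"});'
)

# Capture a double-quoted http(s) URL; the quotes sit outside the group.
URL_PATTERN = re.compile(r'"(https?://[^"]+)"')


def find_quoted_urls(script_text):
    """Return every double-quoted http(s) URL found in script_text."""
    return URL_PATTERN.findall(script_text)


def first_video_url(script_text, extensions=(".mp4", ".flv", ".m3u8")):
    """Return the first quoted URL ending in a known video extension, or None."""
    for candidate in find_quoted_urls(script_text):
        if candidate.lower().endswith(extensions):
            return candidate
    return None


print(first_video_url(sample_script))
```

Returning `None` instead of calling `.group(1)` directly makes a missing match easy to tell apart from other failures. The list of extensions is only a guess and would need adjusting to whatever the real script contains.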

Page URL: http://telly-loans.com/watchvideo.php?id=0kw7cgyat4p7

Part of the HTML containing the IFRAME:

<td>
   <div id='div-gpt-ad-1466152083933-1' style='height:600px; width:160px;'>
     <script type='text/javascript'>googletag.cmd.push(function() { googletag.display('div-gpt-ad-1466152083933-1'); });
     </script>
   </div><!-- END TAG --
</td>
<td>
   <IFRAME SRC="http://watchvideo2.us/embed-0kw7cgyat4p7-540x304.html" FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=540 HEIGHT=304>   </IFRAME>
      </td>
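
The iframe lookup on the HTML above can be tried offline, without any network request. The sketch below assumes BeautifulSoup is installed and uses Python's built-in `html.parser` so that it runs without lxml; `extract_iframe_src` is a hypothetical helper name. The parser lowercases tag and attribute names, so the upper-case `IFRAME` and `SRC` in the markup are reachable as `iframe` and `src`. The guard covers the case where no iframe is found, in which `soup.iframe` would be `None` and `soup.iframe['src']` would raise a `TypeError`.

```python
from bs4 import BeautifulSoup

# The table cell from the question, reduced to the part holding the iframe.
html = '''
<td>
   <IFRAME SRC="http://watchvideo2.us/embed-0kw7cgyat4p7-540x304.html" FRAMEBORDER=0 MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO WIDTH=540 HEIGHT=304>   </IFRAME>
</td>
'''


def extract_iframe_src(markup, parser="html.parser"):
    """Return the src of the first iframe in markup, or None if absent."""
    soup = BeautifulSoup(markup, parser)
    iframe = soup.find("iframe")
    if iframe is None:
        return None
    return iframe.get("src")


print(extract_iframe_src(html))
```

If the same helper returns `None` for the live response, one possible cause is that the iframe is missing from the HTML that `requests` actually received, for example because it is added by JavaScript. Printing `response.text` would show whether that is the case.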

0 answers:

No answers