Question

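I am trying to get the video links from 'https://www.youtube.com/trendsdashboard#loc0=ind'. When I use Inspect Element in the browser, it shows the HTML source for each video. But the source code retrieved with

urllib2.urlopen("https://www.youtube.com/trendsdashboard#loc0=ind").read()

does not contain the HTML for the videos. Is there another way to do this?

<iframe name="myIframe" id="myIframe" style="position:fixed;margin-top:0px;width:100%;height:100%" frameborder="0" src="another.html" scrolling="no" runat =server></iframe>

This simple markup appears when I inspect the element in the browser, but it is not in the source code retrieved by urllib2.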

Answer 1

Works for me...

import urllib2
url = 'https://www.youtube.com/trendsdashboard#loc0=ind'
html = urllib2.urlopen(url).read()

IMO I'd use requests instead of urllib; it's easier to use:

import requests
url = 'https://www.youtube.com/trendsdashboard#loc0=ind'
response = requests.get(url)
html = response.content

Edit

Based on your edit, the approach is to collect all the <a></a> tags that carry a hyperlink. I use the BeautifulSoup library to parse the HTML.
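The BeautifulSoup code mentioned in this answer's edit is not shown here. As a rough sketch of the same idea, the following uses Python 3's standard-library html.parser as a stand-in for BeautifulSoup, so it runs without third-party packages. The names LinkCollector and extract_links and the sample markup are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag that has one (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return the href values of all <a> tags found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Made-up sample markup, not the real page
    sample = (
        '<p><a href="/watch?v=abc">One</a> '
        '<a name="x">no link</a> '
        '<a href="/watch?v=def">Two</a></p>'
    )
    print(extract_links(sample))
```

With BeautifulSoup installed, the roughly equivalent call would be BeautifulSoup(html, 'html.parser').find_all('a', href=True). Either way, only links present in the HTML string you pass in can be found.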

Answer 2

To see the source code you need to call the read method. If you only call urlopen, you get something like this:

In [12]: urllib2.urlopen('https://www.youtube.com/trendsdashboard#loc0=ind')
Out[12]: <addinfourl at 3054207052L whose fp = <socket._fileobject object at 0xb60a6f2c>>

To see the source, use read:

urllib2.urlopen('https://www.youtube.com/trendsdashboard#loc0=ind').read()

Answer 3

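Whenever you compare the source seen by Python code with the source seen in a web browser, don't compare against Inspect Element. Right-click the page and choose View Source instead, and you will find the actual source. Inspect Element shows the aggregated result of the network requests the page made and the JavaScript code it executed.

Open the developer console before loading the page, stay on the Network tab, and make sure "Preserve log" is enabled in Chrome, or "Persist" in Firebug for Firefox. You will then see all of the network requests.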
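To make the difference concrete, here is a small sketch in Python 3. The RAW_SOURCE string is an invented stand-in for what View Source (or urlopen) might return for a page that builds its content with JavaScript; it is not the actual YouTube markup. TagCounter and count_tags are hypothetical helper names.

```python
from collections import Counter
from html.parser import HTMLParser

# Invented example of a raw page: the link is created by JavaScript at
# runtime, so it is not part of the HTML the server sends.
RAW_SOURCE = """
<html>
  <body>
    <div id="videos"></div>
    <script>
      var a = document.createElement("a");
      a.href = "/watch?v=abc";
      document.getElementById("videos").appendChild(a);
    </script>
  </body>
</html>
"""


class TagCounter(HTMLParser):
    """Count how often each start tag occurs in the markup."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    counts = count_tags(RAW_SOURCE)
    # The script tag is present, but no <a> tag exists until a browser runs it.
    print("script tags:", counts["script"])
    print("a tags:", counts["a"])
```

A plain HTTP fetch never runs the script, so the anchor it would create is simply absent from the text that Python receives.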

Answer 4

We also need to decode the data as UTF-8.

Just call response.decode('utf-8') on the bytes returned by read() and print the result.
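A minimal sketch of reading and decoding in Python 3, where urllib2 has become urllib.request and read() returns bytes. The helper name fetch_text is hypothetical, and the data: URL is only an assumption made so the example runs without network access; a real page URL would be passed in the same way.

```python
from urllib.parse import quote
from urllib.request import urlopen


def fetch_text(url, encoding="utf-8"):
    """Fetch a URL and return its body decoded to str."""
    with urlopen(url) as response:
        raw = response.read()  # bytes in Python 3
    return raw.decode(encoding)


if __name__ == "__main__":
    # A data: URL stands in for a real page so the example runs offline.
    page = "<html><body><p>héllo</p></body></html>"
    url = "data:text/html;charset=utf-8," + quote(page)
    print(fetch_text(url))
```

If the page uses a different encoding, pass it as the second argument instead of relying on the UTF-8 default.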

Web page source code that cannot be retrieved with urllib.urlopen()
