Question

I am trying to fetch the content of an HTML page with the following code:

import urllib2  # Python 2 standard library

# req_url holds the URL of the page to fetch
request = urllib2.Request(req_url)
response = urllib2.urlopen(request)
response.read()

Instead of the original page, it returns the following response:

<html xmlns:lxslt="http://xml.apache.org/xslt" xmlns:stringutils="xalan://org.apache.tools.ant.util.StringUtils">
<head>
<META http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Test Results</title>
</head>
<frameset cols="20%,80%">
<frameset rows="30%,70%">
<frame src="overview-frame.html" name="packageListFrame">
<frame src="allclasses-frame.html" name="classListFrame">
</frameset>
<frame src="overview-summary.html" name="classFrame">
<noframes>
<h2>Frame Alert</h2>
<p>
                This document is designed to be viewed using the frames feature. If you see this message, you are using a non-frame-capable web client.
            </p>
</noframes>
</frameset>
</html>

How can I get the content of the original HTML page?
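
For context: the response shown above is a frameset document, so the text a browser displays comes from the separate pages named in each frame's src attribute (overview-frame.html, allclasses-frame.html, overview-summary.html). One possible approach is to read those src values, resolve them against the URL of the frameset page, and fetch each one. The following is only a minimal sketch under some assumptions: it is written for Python 3, where urllib.request replaces urllib2, and the names FrameSrcParser, extract_frame_urls, and fetch_frames are hypothetical helpers, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class FrameSrcParser(HTMLParser):
    """Hypothetical helper: collects the src of every <frame> and <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_frame_urls(html, base_url):
    """Return absolute URLs for each frame referenced in a frameset page."""
    parser = FrameSrcParser()
    parser.feed(html)
    # Frame src values are usually relative, so resolve them against the page URL.
    return [urljoin(base_url, src) for src in parser.sources]


def fetch_frames(req_url, encoding="utf-8"):
    """Fetch the frameset page, then each frame it points to.

    Returns a dict mapping each frame URL to its decoded HTML.
    The utf-8 default is an assumption; undecodable bytes are replaced.
    """
    with urlopen(req_url) as response:
        frameset_html = response.read().decode(encoding, errors="replace")

    pages = {}
    for frame_url in extract_frame_urls(frameset_html, req_url):
        with urlopen(frame_url) as frame_response:
            pages[frame_url] = frame_response.read().decode(encoding, errors="replace")
    return pages
```

Calling fetch_frames(req_url) would then return the HTML of all three frames keyed by URL. Judging by the frame names in the response, the main content is probably the page loaded into classFrame, that is, overview-summary.html. Nested framesets inside a fetched frame are not followed by this sketch.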

How to get an HTML page with frames in Python

0 answers: