Fetching a URL in Python with Google App Engine

Time: 2014-04-12 21:23:41

Tags: google-app-engine python-2.7 urllib2

I use this code to send an HTTP request from my application and then display the result:

def get(self):
    url = "http://www.google.com/"
    try:
        result = urllib2.urlopen(url)
        self.response.out.write(result)
    except urllib2.URLError, e:
        pass  # error handling omitted

I expected to get the HTML of the google.com page, but all I get is the character ">". What is wrong?

2 Answers:

Answer 0 (score: 5)

Try using the urlfetch service instead of urllib2:

Import urlfetch:

from google.appengine.api import urlfetch

Then put this in your request handler:

def get(self):
    try:
        url = "http://www.google.com/"
        result = urlfetch.fetch(url)
        if result.status_code == 200:
            self.response.out.write(result.content)
        else:
            self.response.out.write("Error: " + str(result.status_code))
    except urlfetch.InvalidURLError:
        self.response.out.write("URL is an empty string or obviously invalid")
    except urlfetch.DownloadError:
        self.response.out.write("Server cannot be contacted")

For details, see this document.
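As a rough sketch of how the fetch logic above could be separated from the handler so it can be exercised outside App Engine: the helper below takes the fetch function as a parameter. The names `fetch_page`, `fake_fetch` and `FakeResult` are hypothetical and not part of the urlfetch API. On App Engine you would pass `urlfetch.fetch`, which accepts a `deadline` keyword in seconds.

```python
from collections import namedtuple

# Hypothetical stand-in for the result object returned by urlfetch.fetch();
# only the two attributes used in the answer are modelled.
FakeResult = namedtuple("FakeResult", ["status_code", "content"])


def fetch_page(fetch, url, deadline=10):
    """Fetch url with the given callable and return the text to display.

    On App Engine, `fetch` would be urlfetch.fetch.
    """
    result = fetch(url, deadline=deadline)
    if result.status_code == 200:
        return result.content
    return "Error: " + str(result.status_code)


def fake_fetch(url, deadline=None):
    """Hypothetical fetcher used only to try fetch_page() locally."""
    if url == "http://www.google.com/":
        return FakeResult(200, "<html>ok</html>")
    return FakeResult(404, "")


page = fetch_page(fake_fetch, "http://www.google.com/")
missing = fetch_page(fake_fetch, "http://www.google.com/nope")
```

In a handler this would be called as `self.response.out.write(fetch_page(urlfetch.fetch, url))`, with the `InvalidURLError` and `DownloadError` handling from the answer kept around that call.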

Answer 1 (score: 4)

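You need to call the read() method to read the response. It is also good practice to check the HTTP status and to close the response when you are done.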

Example:

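url = "http://www.google.com/"
try:
    response = urllib2.urlopen(url)
    try:
        if response.code == 200:
            # read() returns the response body (the page's HTML)
            html = response.read()
            self.response.out.write(html)
        else:
            # handle other status codes here
            pass
    finally:
        # always close the response, even if read() fails
        response.close()
except urllib2.URLError as e:
    # raised when the URL cannot be fetched
    pass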
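The stray ">" in the question most likely comes from writing the response object itself: it is converted to a string of the form `<... object at 0x...>`, and the browser treats most of that as an unknown tag, leaving only the final ">" visible. The sketch below illustrates the difference between the object's string form and its body. `FakeResponse` and `read_body` are hypothetical names, and `FakeResponse` only imitates the `code`, `read()` and `close()` members of the object returned by `urllib2.urlopen()`.

```python
import contextlib
import io


class FakeResponse(io.BytesIO):
    """Hypothetical stand-in for the object returned by urllib2.urlopen()."""
    code = 200


def read_body(response):
    """Return the body of a file-like HTTP response, or None if not 200."""
    # contextlib.closing() calls response.close() even if read() raises.
    with contextlib.closing(response):
        if response.code != 200:
            return None
        return response.read()


html_bytes = b"<html><body>Hello</body></html>"

# Converting the object to a string gives its repr, not the page.
as_text = str(FakeResponse(html_bytes))

# Calling read() gives the actual content.
response = FakeResponse(html_bytes)
body = read_body(response)
```

Here `as_text` begins with "<" and ends with ">", while `body` holds the HTML bytes and `response` is closed afterwards.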