Why does my URL request fail?

Time: 2012-08-18 14:05:23

Tags: php python urllib

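OK, so I have a website, and I'm writing a Python script that inserts data into the site by sending it as a GET request to a PHP script. But whenever I pass in a value containing non-alphanumeric characters (@ [] ; :), I get a urllib error saying there is no host in the URL: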

        return urllib.urlopen("http://this-is-an-example.com/thisisadirectory/file.php?f=Hello&v="+cgi.escape("This is@A#!T33ST::::;'[]{}"))
      File "Python25\lib\urllib.py", line 82, in urlopen
        return opener.open(url)
      File "Python25\lib\urllib.py", line 190, in open
        return getattr(self, name)(url)
      File "Python25\lib\urllib.py", line 301, in open_http
        if not host: raise IOError, ('http error', 'no host given')
    IOError: [Errno http error] no host given

I also tried using my own escape function to escape all of the special characters (or at least some of them):

    full_escape_chars = {" ": "%20",
                    "<": "%3C",
                    ">": "%3E",
                    "#": "%23",
                    "%": "%25",
                    "{": "%7B",
                    "}": "%7D",
                    "|": "%7C",
                    "\\": "%5C",
                    "^": "%5E",
                    "~": "%7E",
                    "[": "%5B",
                    "]": "%5D",
                    "`": "%60",
                    ";": "%3B",
                    "/": "%2F",
                    "?": "%3F",
                    ":": "%3A",
                    "@": "%40",
                    "=": "%3D",
                    "&": "%26",
                    "$": "%24"}
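    def full_escape(s):
        # Encode "%" first so the "%XX" sequences produced below
        # are not themselves re-encoded.
        s = s.replace("%", "%25")
        for key, value in full_escape_chars.items():
            if "%" not in key:
                s = s.replace(key, value)
        return s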

But no luck. Please suggest how to fix this! Thanks in advance.

1 answer:

Answer 0: (score: 1)

One problem may be that cgi.escape does not do what you think it does; take a look at urllib.quote_plus instead:

>>> import cgi
>>> import urllib
>>> s = "This is@A#!T33ST::::;'[]{}"
>>> cgi.escape(s)
"This is@A#!T33ST::::;'[]{}"
>>> urllib.quote_plus(s)
'This+is%40A%23%21T33ST%3A%3A%3A%3A%3B%27%5B%5D%7B%7D'
  

cgi.escape(s[, quote])

Convert the characters '&', '<' and '>' in string s to HTML-safe sequences. 
Use this if you need to display text that might contain such characters in HTML.
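
To make the difference concrete, here is a small sketch written for Python 3, which is an assumption on my part since the question uses Python 2.5. In Python 3, urllib.quote_plus lives at urllib.parse.quote_plus, and html.escape(s, quote=False) plays the role of cgi.escape(s). The sample string is made up for illustration.

```python
import html
from urllib.parse import quote_plus

# Hypothetical sample value containing both HTML- and URL-significant characters.
sample = "a<b & c@d#e"

# HTML escaping only rewrites &, < and > (quotes are left alone with quote=False).
html_safe = html.escape(sample, quote=False)

# URL quoting percent-encodes every character that is unsafe in a query
# string and turns spaces into "+".
url_safe = quote_plus(sample)

print(html_safe)
print(url_safe)
```

Note that the HTML-escaped result still contains a space, "@" and "#", so it is not suitable for pasting into a query string.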

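This generally behaves more sensibly (the 404 page below presumably just reflects that the example path does not exist on that server):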

>>> urllib.urlopen("http://this-is-an-example.com/thisisadirectory/file.php?f=Hello&v="+urllib.quote_plus("This is@A#!T33ST::::;'[]{}"))
<addinfourl at 24629656 whose fp = <socket._fileobject object at 0x16ebc30>>
>>> _.read()
'<?xml version="1.0" encoding="iso-8859-1"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n <head>\n  <title>404 - Not Found</title>\n </head>\n <body>\n  <h1>404 - Not Found</h1>\n </body>\n</html>\n'
>>>
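
If there are several parameters, it may be simpler to let the library build the whole query string. A minimal sketch, again assuming Python 3 (urllib.parse.urlencode; in Python 2 the same function is urllib.urlencode), with the URL and parameter names taken from the question. It only builds and decodes the URL and does not send a request.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

base_url = "http://this-is-an-example.com/thisisadirectory/file.php"
params = {"f": "Hello", "v": "This is@A#!T33ST::::;'[]{}"}

# urlencode applies quote_plus to every key and value and joins the pairs with "&".
url = base_url + "?" + urlencode(params)
print(url)

# Decoding the query string again recovers the original value unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["v"][0])
```

This avoids concatenating escaped pieces by hand, so a forgotten escape on one parameter cannot corrupt the rest of the URL.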