Authenticated HTTP POST with an XML payload using Python urllib2

Date: 2010-07-02 10:40:32

Tags: python http post ironpython urllib2

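I'm trying to send a POST message with a plain XML payload (I think) using urllib2 in IronPython. However, every time I send it, the server returns error code 400 (Bad Request).

What I'm actually trying to do is imitate Boxee's "remove queue item" call. The real packet looks like this (captured with Wireshark):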

POST /action/add HTTP/1.1
User-Agent: curl/7.16.3 (Windows  build 7600; en-US; beta) boxee/0.9.21.11487
Host: app.boxee.tv
Accept: */*
Accept-Encoding: deflate, gzip
Cookie: boxee_ping_version=9; X-Mapping-oompknoc=76D730BC9E858725098BF13AEFE32EB5; boxee_app=e01e36e85d368d4112fe4d1b6587b1fd
Connection: keep-alive
Content-Type: text/xml
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Language: en-us,en;q=0.5
Keep-Alive: 300
Connection: keep-alive
Content-Length: 53

<message type="dequeue" referral="3102296"></message>

I'm using the following Python code to send the POST:

import base64
import re
import sys
import urllib2

def PostProtectedPage(theurl, username, password, postdata):

    req = urllib2.Request(theurl, data=postdata)
    req.add_header('Content-Type', 'text/xml')
    try:
        handle = urllib2.urlopen(req)
    except IOError, e:                  # here we are assuming we fail
        pass
    else:                               # If we don't fail then the page isn't protected
        print "This page isn't protected by authentication."
        sys.exit(1)

    if not hasattr(e, 'code') or e.code != 401:                 # we got an error - but not a 401 error
        print "This page isn't protected by authentication."
        print 'But we failed for another reason.'
        sys.exit(1)

    authline = e.headers.get('www-authenticate', '')                # get the WWW-Authenticate header, which carries the authentication scheme and realm
    if not authline:
        print 'A 401 error without an authentication response header - very weird.'
        sys.exit(1)

    authobj = re.compile(r'''(?:\s*www-authenticate\s*:)?\s*(\w*)\s+realm=['"](\w+)['"]''', re.IGNORECASE)          # this regular expression is used to extract scheme and realm
    matchobj = authobj.match(authline)
    if not matchobj:                                        # if the authline isn't matched by the regular expression then something is wrong
        print 'The authentication line is badly formed.'
        sys.exit(1)
    scheme = matchobj.group(1) 
    realm = matchobj.group(2)
    if scheme.lower() != 'basic':
        print 'This example only works with BASIC authentication.'
        sys.exit(1)

    base64string = base64.encodestring('%s:%s' % (username, password))[:-1]
    authheader =  "Basic %s" % base64string
    req.add_header("Authorization", authheader)
    try:
        handle = urllib2.urlopen(req)
    except IOError, e:                  # here we shouldn't fail if the username/password is right
        print "It looks like the username or password is wrong."
        print e
        sys.exit(1)
    thepage = handle.read()
    return thepage
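
For comparison, the "request, get a 401, retry with credentials" flow that the function above implements by hand is roughly what urllib2's HTTPBasicAuthHandler automates. The following is only a sketch: the helper name make_basic_auth_opener is hypothetical, and the import fallback to urllib.request is an assumption added so that the snippet also runs on Python 3.

```python
try:                                   # Python 2 / IronPython 2
    import urllib2 as urlrequest
except ImportError:                    # Python 3
    import urllib.request as urlrequest


def make_basic_auth_opener(url, username, password):
    """Return (opener, password_manager) that answers Basic auth challenges."""
    password_mgr = urlrequest.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means "use these credentials for any realm at this URL"
    password_mgr.add_password(None, url, username, password)
    handler = urlrequest.HTTPBasicAuthHandler(password_mgr)
    return urlrequest.build_opener(handler), password_mgr
```

The returned opener's open(req) method would take the place of urllib2.urlopen(req). Any response other than a 401 challenge, including a 400, is still raised as an HTTPError, so this changes how the credentials are sent, not how the server judges the request body.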

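However, whenever I run PostProtectedPage, it returns error 400 (Bad Request).
I know the authentication is correct, because I use it elsewhere to fetch the queue. (I can't imagine it isn't being used here; otherwise, how would the server know which account to apply the change to?)

Looking at the network capture, could I simply be missing some headers on the request? It's probably something simple, but I don't know enough about Python or HTTP requests to tell what.

Edit: by the way, I'm calling the code as follows (the payload is actually built dynamically, but this is the basic idea):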

PostProtectedPage("http://app.boxee.tv/action/add", "user", "pass", "<message type=\"dequeue\" referral=\"3102296\"></message>")
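
Because the payload is built dynamically, one way to avoid hand-escaped quotes is to generate the message from the referral ID. This is a minimal sketch based only on the message format shown in the capture; the function name build_dequeue_message is hypothetical.

```python
from xml.sax.saxutils import quoteattr


def build_dequeue_message(referral_id):
    """Build the dequeue XML body for the given referral ID."""
    # quoteattr() escapes the value and adds the surrounding quotes
    return '<message type="dequeue" referral=%s></message>' % quoteattr(str(referral_id))
```

For the referral ID in the capture, the result is 53 characters long, which matches the Content-Length header of the captured request.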

1 answer:

Answer 0 (score: 0)

This works fine for me:

curl -v -A 'curl/7.16.3 (Windows  build 7600; en-US; beta) boxee/0.9.21.11487' \
 -H 'Content-Type: text/xml' -u "USER:PASS" \
 --data '<message type="dequeue" referral="12573293"></message>' \
 'http://app.boxee.tv/action/add'

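But if I try to delete a referral ID that is not currently in the queue, I get 400 Bad Request. If you are using the same referral ID that you captured in Wireshark, that is very likely what is happening to you as well. Use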

wget -nv -m -nd --user=USER --password=PASS http://app.boxee.tv/api/get_queue

to make sure that the item you are trying to delete is actually in the queue.
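
The curl command in this answer can be approximated in Python roughly as follows. Like curl -u, the sketch sends the Basic credentials up front instead of waiting for a 401 challenge, and it returns the status and body of error responses so that a 400 can be inspected. The function names build_xml_post and send are hypothetical, and the urllib.request fallback is an assumption added so the snippet also runs on Python 3.

```python
import base64

try:                                   # Python 2 / IronPython 2
    import urllib2 as urlrequest
except ImportError:                    # Python 3
    import urllib.request as urlrequest


def build_xml_post(url, username, password, xml_payload):
    """Build (but do not send) a POST with Basic auth and an XML body."""
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    # Passing data makes the request a POST
    req = urlrequest.Request(url, data=xml_payload.encode('utf-8'))
    req.add_header('Content-Type', 'text/xml')
    req.add_header('Authorization', 'Basic %s' % token)
    return req


def send(req):
    """Send the request; return (status, body) even for HTTP error responses."""
    try:
        handle = urlrequest.urlopen(req)
        return handle.getcode(), handle.read()
    except urlrequest.HTTPError as e:
        # The error body may explain why the server rejected the request
        return e.code, e.read()
```

Calling send(build_xml_post('http://app.boxee.tv/action/add', 'USER', 'PASS', '<message type="dequeue" referral="12573293"></message>')) would then yield the status code and response body. A 400 status would be consistent with the referral ID not being in the queue.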