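How to convert a requests.RequestsCookieJar to a string

Date: 2018-04-12 02:25:25

Tags: python python-requests

I have a requests.RequestsCookieJar object that contains multiple cookies from different domains/paths. How can I extract the cookie string for a specific domain/path, following the rules mentioned here?

For example: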

>>> r = requests.get("https://stackoverflow.com")
>>> print(r.cookies)
<RequestsCookieJar[<Cookie prov=4df137f9-848e-01c3-f01b-35ec61022540 for .stackoverflow.com/>]>

# the function I expect
>>> getCookies(r.cookies, "stackoverflow.com")
"prov=4df137f9-848e-01c3-f01b-35ec61022540"

>>> getCookies(r.cookies, "meta.stackoverflow.com")
"prov=4df137f9-848e-01c3-f01b-35ec61022540"
# meta.stackoverflow.com also matches, as it is a subdomain of .stackoverflow.com

>>> getCookies(r.cookies, "google.com")
""
# r.cookies does not contain any cookie for google.com, so it returns an empty string

4 answers:

Answer 0: (score: 2)

Actually, I ran into the same problem as you. Then I looked at the class definition:

class RequestsCookieJar(cookielib.CookieJar, MutableMapping):

and found a method declared as def get_dict(self, domain=None, path=None):. You can simply write code like this:

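from http.cookies import SimpleCookie

# Placeholder: replace with a raw Cookie header string such as "a=1; b=2"
raw = "rawCookie"
print(len(raw))
mycookie = SimpleCookie()
mycookie.load(raw)
# Collect the parsed cookies into a plain name -> value dict
UCookie = {}
for key, morsel in mycookie.items():
    UCookie[key] = morsel.value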

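Answer 1: (score: 1)

New answer

OK, so I still don't quite understand what you are trying to achieve.

If you want to extract the original URL from the requests.RequestsCookieJar object (so that you can check whether it matches a given subdomain), that is (as far as I know) not possible.

However, you can do something like this: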

#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import requests
import re

class getCookies():

    def __init__(self, url):

        self.cookiejar = requests.get(url).cookies
        self.url = url

    def check_domain(self, domain):

        try:

            # Raw string avoids invalid-escape warnings for "\."
            base_domain = re.compile(r"(?<=\.).+\..+$").search(domain).group()

        except AttributeError:

            base_domain = domain

        if base_domain in self.url:

            print("\"prov=" + str(dict(self.cookiejar)["prov"]) + "\"")

        else:

            print("No cookies for " + domain + " in this jar!")

Then, if you do:

new_instance = getCookies("https://stackoverflow.com")

you can then call:

new_instance.check_domain("meta.stackoverflow.com")

which gives the output:

"prov=5d4fda78-d042-2ee9-9a85-f507df184094"

while:

new_instance.check_domain("google.com")

outputs:

No cookies for google.com in this jar!

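Then, if you fine-tune the regex (if needed) and create a list of URLs, you can first loop over that list to create many instances and store them in, for example, a list or a dict. In a second loop you can check another list of URLs to see whether their cookies might be present in any of the instances.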

OLD ANSWER

The documentation you linked explains:

  

items()

    Dict-like items() that returns a list of name-value tuples from the jar. Allows client code to call dict(RequestsCookieJar) and get a vanilla Python dict of key-value pairs.

I think what you are looking for is:

#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import requests

def getCookies(url):

    r = requests.get(url)

    print("\"prov=" + str(dict(r.cookies)["prov"]) + "\"")

Now I can run it like this:

>>> getCookies("https://stackoverflow.com")
"prov=f7712c78-b489-ee5f-5e8f-93c85ca06475"

Answer 2: (score: 1)

I think you need to work with a Python dictionary of the cookies. (See my comment above.)

def getCookies(cookie_jar, domain):
    cookie_dict = cookie_jar.get_dict(domain=domain)
    found = ['%s=%s' % (name, value) for (name, value) in cookie_dict.items()]
    return ';'.join(found)

Your example:

>>> r = requests.get("https://stackoverflow.com")
>>> getCookies(r.cookies, ".stackoverflow.com")
"prov=4df137f9-848e-01c3-f01b-35ec61022540"

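Note that get_dict(domain=...) compares the domain for exact equality, which is why the example passes ".stackoverflow.com" with the leading dot; it does not match subdomains such as meta.stackoverflow.com. A minimal sketch of subdomain-aware matching follows. It relies only on the fact that RequestsCookieJar subclasses the standard library CookieJar and can be iterated. The helper names (make_cookie, domain_matches, path_matches, get_cookie_string) are hypothetical, the matching rules are a simplified reading of the usual cookie domain/path rules, and the demo jar is built by hand so the sketch runs without a network request.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain, path="/"):
    """Build a stdlib Cookie for the demo (hypothetical helper)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def domain_matches(host, cookie):
    """Simplified domain match: the exact host, or a subdomain of it."""
    host = host.lower()
    cookie_domain = cookie.domain.lower().lstrip(".")
    if not cookie.domain_specified:
        # Host-only cookie: must match the host exactly.
        return host == cookie_domain
    return host == cookie_domain or host.endswith("." + cookie_domain)


def path_matches(request_path, cookie_path):
    """Simplified path match: cookie_path is a prefix on a '/' boundary."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def get_cookie_string(cookie_jar, host, path="/"):
    """Join matching cookies as 'name=value; name2=value2'."""
    pairs = [
        "%s=%s" % (c.name, c.value)
        for c in cookie_jar
        if domain_matches(host, c) and path_matches(path, c.path)
    ]
    return "; ".join(pairs)


# Demo jar with made-up values, standing in for r.cookies.
jar = CookieJar()
jar.set_cookie(make_cookie("prov", "4df137f9", ".stackoverflow.com"))
jar.set_cookie(make_cookie("sid", "xyz", ".example.com"))

print(get_cookie_string(jar, "stackoverflow.com"))
print(get_cookie_string(jar, "meta.stackoverflow.com"))
print(repr(get_cookie_string(jar, "google.com")))
```

This sketch ignores the secure flag, expiry, and ports; those checks would need to be added before using the result to build a real Cookie header.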
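Answer 3: (score: 0)

The code below is not guaranteed to be "forward compatible", because I am accessing attributes of a class that its author has deliberately (sort of) hidden. However, if you must know a cookie's attributes, look here: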

import http.cookies
import requests
import json
import sys
import os

aresponse = requests.get('https://www.att.com')
requestscookiejar = aresponse.cookies
for cdomain,cooks in requestscookiejar._cookies.items():
    for cpath, cookgrp in cooks.items():
        for cname,cattribs in cookgrp.items():
            print(cattribs.version)
            print(cattribs.name)
            print(cattribs.value)
            print(cattribs.port)
            print(cattribs.port_specified)
            print(cattribs.domain)
            print(cattribs.domain_specified)
            print(cattribs.domain_initial_dot)
            print(cattribs.path)
            print(cattribs.path_specified)
            print(cattribs.secure)
            print(cattribs.expires)
            print(cattribs.discard)
            print(cattribs.comment)
            print(cattribs.comment_url)
            print(cattribs.rfc2109)
            print(cattribs._rest)
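
The same attributes can also be reached without the private _cookies mapping, because a cookie jar can be iterated directly and yields Cookie objects. A minimal sketch, assuming a hypothetical helper named describe_cookies and a hand-built demo jar with made-up values so it runs without network access; with requests, aresponse.cookies could be passed in the same way:

```python
from http.cookiejar import Cookie, CookieJar


def describe_cookies(cookie_jar):
    """Return one dict of public attributes per cookie in the jar."""
    return [
        {
            "name": c.name,
            "value": c.value,
            "domain": c.domain,
            "path": c.path,
            "secure": c.secure,
            "expires": c.expires,
        }
        for c in cookie_jar  # CookieJar supports iteration directly
    ]


# Demo jar built by hand; the cookie values are illustrative only.
demo_jar = CookieJar()
demo_jar.set_cookie(Cookie(
    version=0, name="session", value="abc", port=None, port_specified=False,
    domain=".example.com", domain_specified=True, domain_initial_dot=True,
    path="/", path_specified=True, secure=True, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
))

for info in describe_cookies(demo_jar):
    print(info)
```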

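When you only need access to a cookie's simple attributes, the following approach may be less complicated. It avoids RequestsCookieJar altogether. Here we construct a SimpleCookie instance by reading the response object's headers attribute instead of its cookies attribute. The name SimpleCookie seems to imply a single cookie, but a SimpleCookie can in fact hold several cookies. Try it: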

import http.cookies
import requests
import json
import sys
import os

def parse_cookies(http_response):
    cookie_grp = http.cookies.SimpleCookie()
    for h,v in http_response.headers.items():
        if 'set-cookie' in h.lower():
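            # Caveat: requests joins repeated Set-Cookie headers with ", ", and
            # splitting on "," also cuts Expires dates such as "Wed, 21 Oct 2015 ...".
            for cook in v.split(','):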
                cookie_grp.load(cook)
    return cookie_grp

aresponse = requests.get('https://www.att.com')
cookies = parse_cookies(aresponse)
print(str(cookies))
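
As a hedged variant of the approach above, the sketch below loads each Set-Cookie line separately instead of splitting a combined header on commas, so an Expires date containing a comma stays intact. The function name parse_set_cookie_lines and the sample header lines are made up for illustration. With requests, one possible source of the individual lines is response.raw.headers.getlist("Set-Cookie"), but treat that as an assumption to verify against your installed urllib3 version.

```python
from http.cookies import SimpleCookie


def parse_set_cookie_lines(lines):
    """Load each Set-Cookie header value into one SimpleCookie container."""
    cookies = SimpleCookie()
    for line in lines:
        cookies.load(line)
    return cookies


# Illustrative header values; real ones come from the server response.
lines = [
    "prov=abc123; Domain=.stackoverflow.com; Path=/; HttpOnly",
    "sid=xyz; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Path=/",
]
cookies = parse_set_cookie_lines(lines)

for name, morsel in cookies.items():
    # Each morsel carries the value plus attributes such as path and expires.
    print(name, morsel.value, morsel["path"], morsel["expires"])

# Build a "name=value; name2=value2" string from the parsed cookies.
cookie_string = "; ".join("%s=%s" % (k, m.value) for k, m in cookies.items())
print(cookie_string)
```

This only parses the attributes; it does not decide which cookies apply to a given domain or path.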