我正在开发一种服务,可以收听网址,访问每个网址,并为每个请求获取Cookie。
目前我有这样的事情:
browser = webdriver.Firefox()
browser.get('http://google.com')
cookies = browser.get_cookies()
#parse cookies
但是,这只给我第一方Cookie,但我还需要获得第三方Cookie。我发现Selenium网站驱动程序不支持此功能。我想知道如何实现这一目标?我不仅限于Selenium,所以会欣赏其他解决方案。
答案 0 :(得分:4)
import requests
s = requests.session()
r = s.get('http://google.com')
r = s.get('http://facebook.net')
for cookie in s.cookies:
print(cookie)
使用:Selenium + PhantomJS
from selenium import webdriver
cookie_file_path = 'cookie.txt'
args = ['--cookies-file={}'.format(cookie_file_path)]
driver = webdriver.PhantomJS(service_args=args)
driver.get('http://google.com')
driver.get('http://facebook.com')
with open(cookie_file_path) as f:
print(f.read())
输出(包裹):
[General]
cookies="@Variant(\0\0\0\x7f\0\0\0\x16QList<QNetworkCookie>\0\0\0\0\x1\0\0\0\a\0
\0\0\xd6NID=67=SZetUV-oLq_M8ik40VT2GEIb45LMaXkhm6H3zx1wULO52qkCHPc9AML_p5eubW4zL
Ms158YAYKQTdCJzb4mInix_Zek6P8Ej1XZh9h5Ng3I7X4gZuE_S-Fl2YpaSYd9B; HttpOnly; expir
es=Wed, 18-Dec-2013 02:44:31 GMT; domain=.google.co.kr; path=/\0\0\0ldatr=kMm_Ue
0P06lxFANs8c-wCgwG; HttpOnly; expires=Thu, 18-Jun-2015 02:44:32 GMT; domain=.fac
ebook.com; path=/\0\0\0Kreg_fb_gate=https%3A%2F%2Fwww.facebook.com%2F; domain=.f
acebook.com; path=/\0\0\0Jreg_fb_ref=https%3A%2F%2Fwww.facebook.com%2F; domain=.
facebook.com; path=/\0\0\0\xa2PREF=ID=be651672f1ddac52:U=515e3545a8a53080:FF=0:T
M=1371523471:LM=1371524047:S=iqfF3qNRUwVsInZR; expires=Thu, 18-Jun-2015 02:54:07
GMT; domain=.google.com; path=/\0\0\0\xd4NID=67=pm8Ws9703eugHhhImX_hBpqhUyAhCUG
TebjDZ6YY_cP7CuvIA4x8ElgGaj6tOweXFxxjALoX1PwqFvHHkUY1kerw3vwM-VaIyyPVSADMqOnR-Ty
ed_bGU3bk6YSwUUeG; HttpOnly; expires=Wed, 18-Dec-2013 02:54:07 GMT; domain=.goog
le.com; path=/\0\0\0\xa9PREF=ID=9769c9a2d96728cf:U=3d59c2548337b74e:FF=0:NW=1:TM
=1371523471:LM=1371524047:S=vE5Y_06LhP4unse7; expires=Thu, 18-Jun-2015 02:54:07
GMT; domain=.google.co.kr; path=/)"