Python twill: downloading a file served through a PHP script

Date: 2016-06-19 18:58:10

Tags: php python twill

I use twill to navigate a website that is protected by a login form.

from twill.commands import *

go('http://www.example.com/login/index.php') 
fv("login_form", "identifiant", "login")
fv("login_form", "password", "pass")
formaction("login_form", "http://www.example.com/login/control.php")
submit()
go('http://www.example.com/accueil/index.php')

On the last page, I want to download an Excel file that is reachable through a div with the following attribute:

onclick="OpenWindowFull('../util/exports/control.php?action=export','export',200,100);"

With twill I can open the URL of the PHP script and display the contents of the file.

go('http://www.example.com/util/exports/control.php?action=export')
show()

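However, this returns a string holding the raw content, which is unusable as is. Is there a way to retrieve the Excel file directly, in a manner similar to urllib.urlretrieve()?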

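2 answers:

Answer 0 (score: 1)

I managed to pass the cookies from twill to requests.

Note: I could not use requests for the login itself, because the login involves intricate checks (I could not work out the right headers or other options).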

import requests
from twill.commands import *

# showing login form with twill
go('http://www.example.com/login/index.php') 
showforms()

# posting login form with twill
fv("login_form", "identifiant", "login")
fv("login_form", "password", "pass")
formaction("login_form", "http://www.example.com/login/control.php")
submit()

# getting binary content with requests using twill cookie jar
cookies = requests.utils.dict_from_cookiejar(get_browser()._session.cookies)
url = 'http://www.example.com/util/exports/control.php?action=export'

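response = requests.get(url, stream=True, cookies=cookies)

# fail before creating the output file, so no empty out.xls is left behind
if not response.ok:
    raise Exception('Could not get file from ' + url)

# write the body in binary mode, 1 KiB at a time
with open('out.xls', 'wb') as handle:
    for block in response.iter_content(1024):
        handle.write(block)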
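
Since the question asks for something close to urllib.urlretrieve(), the same idea can probably be sketched with the standard library alone. This is only a sketch for Python 3, under assumptions: fetch_to_file is a hypothetical helper name, and it assumes the cookie jar taken from twill (get_browser()._session.cookies above) behaves like an http.cookiejar.CookieJar, which I believe is the case for the jar used by requests but have not checked against every twill version.

```python
import shutil
import urllib.request
from http.cookiejar import CookieJar


def fetch_to_file(url, path, cookie_jar=None):
    """Stream the body at `url` into `path` as raw bytes (hypothetical helper)."""
    # Assumption: cookie_jar is CookieJar-compatible, e.g. the jar taken from twill.
    jar = cookie_jar if cookie_jar is not None else CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # The URL is opened first, so an HTTP error is raised before the file is created.
    with opener.open(url) as response, open(path, "wb") as handle:
        shutil.copyfileobj(response, handle)
    return path
```

After logging in with twill, a call might look like fetch_to_file(url, 'out.xls', get_browser()._session.cookies), with the same caveat about the private _session attribute.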

Answer 1 (score: 0)

Another approach is to use twill.commands.save_html, modified to write the file in 'wb' mode instead of 'w': Python 2.7 using twill, saving downloaded file properly
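
To illustrate why the file mode matters, here is a minimal sketch in Python 3. save_binary is a hypothetical helper, not part of twill, and content is assumed to be the raw bytes of the download; how those bytes are obtained from the twill browser depends on the twill version and is not shown.

```python
def save_binary(content, path):
    """Write raw bytes to `path` unchanged (hypothetical helper)."""
    # 'wb' disables newline translation and text encoding, both of which
    # can corrupt a binary format such as .xls when the file is opened in 'w' mode.
    with open(path, "wb") as handle:
        handle.write(content)
    return path
```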