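I am trying to download a file from a website, but the page source contains no tags related to a button or form. The site uses custom buttons. The page source looks like this: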
var LABELS={"kind":"Template type","format":"File format","btn_generate":"Generate template","btn_cancel":"Cancel","btn_start":"Start import","btn_back":"Back to overview","legend":"Mandatory fields are <span class='marked'>blue</span>"};
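In the code above, "btn_generate":"Generate template" is my button.
Normally, if I click this button on the web page, it downloads a file.
I am parsing the page source and reading the value of "btn_generate", which is "Generate template".
Now I want to apply a button click to this "btn_generate":"Generate template" entry in order to download the file. I am not sure whether this is possible.
My code: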
#!/usr/bin/python
import json
import re

import mechanize
import six
from cookies import *  # local module that provides Cookiee

# Session cookies as a dict of name -> value
myCookie = Cookiee().get_cookies()
req = six.moves.urllib.request.Request(
    url='https://xxxx.xxxxxxx.xxx/xxx-xxx/import?PAGE=1',
    headers={
        'Cookie': '; '.join([k+'='+v for k, v in myCookie.items()])
    }
)
response = six.moves.urllib.request.urlopen(req)
text = bytes(response.read()).decode('utf-8', errors="replace")
# Take the first inline <script> block, which declares the page's JS variables
javascript = re.findall(
    r'<script language=\'javascript\'>(.+?)</script>',
    text,
    flags=re.S
)[0]
# Split into individual "var NAME=..." declarations and pick the LABELS one
jvars = javascript.split('\nvar ')
labels = [x for x in jvars if x.startswith("LABELS")][0]
jsonoverview = json.loads(labels.split('=', 1)[1].rstrip(';\n'))
btn_name = jsonoverview['btn_generate']
print('btn_name', btn_name)
br = mechanize.Browser()
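For reference, the LABELS lookup can be tried in isolation without the network request. The following is a minimal sketch, assuming the page declares `var LABELS={...};` on a single line as in the snippet above. `SAMPLE_PAGE` and `extract_labels` are hypothetical names used only for illustration, and the real page may be laid out differently.

```python
import json
import re

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_PAGE = """<script language='javascript'>
var PAGE=1;
var LABELS={"kind":"Template type","btn_generate":"Generate template","btn_cancel":"Cancel"};
</script>"""


def extract_labels(page_text):
    """Return the LABELS object from the page as a dict, or None if absent."""
    # Capture the object literal assigned to LABELS, ending at "};" at end of line
    match = re.search(r'var\s+LABELS\s*=\s*(\{.*?\})\s*;\s*$', page_text, flags=re.M)
    if match is None:
        return None
    return json.loads(match.group(1))


labels = extract_labels(SAMPLE_PAGE)
print(labels["btn_generate"])
```

Note that the value obtained this way appears to be only the caption shown on the button. It is display text, so it does not by itself describe the request that the click triggers.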
HTML page:
<div id="contentWrapper"><table id="pageTitleBox"><tbody><tr><td><div id="backBtn" class="goog-custom-button goog-inline-block" role="button" tabindex="0" title="Back to overview" style="-webkit-user-select: none;"><div class="goog-inline-block goog-custom-button-outer-box"><div class="goog-inline-block goog-custom-button-inner-box"><div class="back-icon goog-inline-block"></div><span></span></div></div></div></td><td><div id="pageTitle">Import template</div></td></tr></tbody></table><table id="topControlBox"><tbody><tr><td class="rlabel dropBox">Template type:</td><td class="dropBox"><div id="isNewTicketDropDownBox"><div class="goog-inline-block goog-menu-button" title="" role="button" tabindex="0" aria-haspopup="true" style="-webkit-user-select: none;"><div class="goog-inline-block goog-menu-button-outer-box"><div class="goog-inline-block goog-menu-button-inner-box"><div class="goog-inline-block goog-menu-button-caption">for updating</div><div class="goog-inline-block goog-menu-button-dropdown"> </div></div></div></div></div></td><td class="rlabel dropBox">File format:</td><td class="dropBox"><div id="formatDropDownBox"><div class="goog-inline-block goog-menu-button" title="" role="button" tabindex="0" aria-haspopup="true" style="-webkit-user-select: none;"><div class="goog-inline-block goog-menu-button-outer-box"><div class="goog-inline-block goog-menu-button-inner-box"><div class="goog-inline-block goog-menu-button-caption">Excel</div><div class="goog-inline-block goog-menu-button-dropdown"> </div></div></div></div></div></td><td><div id="startBtn" class="goog-custom-button goog-inline-block" role="button" tabindex="0" title="Start import" style="-webkit-user-select: none;"><div class="goog-inline-block goog-custom-button-outer-box"><div class="goog-inline-block goog-custom-button-inner-box"><div class="action-icon goog-inline-block"></div><span>Start import</span></div></div></div></td></tr></tbody></table><div class="btnPanel"><table><tbody><tr><td><div 
id="generateBtn" class="goog-custom-button goog-inline-block" role="button" tabindex="0" title="Generate template" style="-webkit-user-select: none;"><div class="goog-inline-block goog-custom-button-outer-box"><div class="goog-inline-block goog-custom-button-inner-box"><div class="save-icon goog-inline-block"></div><span>Generate template</span></div></div></div></td></tr></tbody></table></div></div>
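The markup above contains no `<form>` or `<button>` elements. Each control is a `<div>` with `role="button"`, and its click behaviour is presumably attached by JavaScript, which mechanize does not execute. The sketch below only confirms which elements act as buttons. It assumes the markup is the rendered DOM copied from the browser (the raw response from `urlopen` may not contain it), and `SAMPLE_HTML` and `ButtonFinder` are hypothetical names used for illustration.

```python
from html.parser import HTMLParser

# Trimmed copy of the button markup from the page above
SAMPLE_HTML = (
    '<div id="backBtn" class="goog-custom-button goog-inline-block" '
    'role="button" title="Back to overview"></div>'
    '<div id="startBtn" class="goog-custom-button goog-inline-block" '
    'role="button" title="Start import"><span>Start import</span></div>'
    '<div id="generateBtn" class="goog-custom-button goog-inline-block" '
    'role="button" title="Generate template"><span>Generate template</span></div>'
)


class ButtonFinder(HTMLParser):
    """Collect id -> title for every element marked role="button"."""

    def __init__(self):
        super().__init__()
        self.buttons = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only keep elements that declare role="button" and carry an id
        if attr_map.get("role") == "button" and "id" in attr_map:
            self.buttons[attr_map["id"]] = attr_map.get("title", "")


finder = ButtonFinder()
finder.feed(SAMPLE_HTML)
print(finder.buttons)
```

Finding `generateBtn` this way does not download anything. Triggering the download would likely require either a tool that drives a real browser or reproducing the request that the button sends, which can be observed in the browser's network panel.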