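How to access a row in a table inside an HTML form using Python mechanize

Date: 2015-03-13 11:26:35

Tags: python html python-2.7 mechanize-python

I am trying to access a row inside an HTML form. The HTML of the form looks like this: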

</script> 

<form name="calendarForm" method="post" action="/ibook/publicLogin.do" onsubmit="return validateForm(this);"><div><input type="hidden" name="org.apache.struts.taglib.html.TOKEN" value="7a6aa28270cc38601c894a05d01b7264"></div> 

   <input type="hidden" name="apptDetails.apptType" value="PRAP"> 

   <table border="0" cellspacing="1" cellpadding="0" align="center"> 
       <tr> 
         <td nowrap="nowrap" width="20%"><div id=id_div1 style="display:''"><FONT color=#ff0000>*</FONT><label for="apptDetails.identifier1">Sponsor's NRIC /<br />&nbsp;&nbsp;Applicant's FIN</label></div></td> 
         <td nowrap="nowrap" width="5%">:</td> 
         <td nowrap="nowrap" width="75%"><div id=id_div3 style="display:''"><input type="text" name="apptDetails.identifier1" maxlength="9" size="15" value="" onblur="javascript:this.value=this.value.toUpperCase();" id="apptDetails.identifier1" style="text-transform: uppercase;" class="txtFill_singleLine"></div></td> 
       </tr> 

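I want to enter a value into the field with name="apptDetails.identifier1" in this table row. How do I do that? I can't seem to reach the row with the usual Python mechanize form options. Any hints would be appreciated.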

Here is my code:

import cookielib 
import urllib2 
import mechanize 

# Browser 
br = mechanize.Browser() 

# Enable cookie support for urllib2 
cookiejar = cookielib.LWPCookieJar() 
br.set_cookiejar( cookiejar ) 

# Browser options 
br.set_handle_equiv( True ) 
br.set_handle_gzip( True ) 
br.set_handle_redirect( True ) 
br.set_handle_referer( True ) 
br.set_handle_robots( False ) 

# Follow refresh redirects, but only when the delay is at most 1 second 
br.set_handle_refresh( mechanize._http.HTTPRefreshProcessor(), max_time = 1 ) 

br.addheaders = [ ( 'User-agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0' ),('Host','eappointment.ica.gov.sg'),('Accept','text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8') ] 

# authenticate 
br.open('https://eappointment.ica.gov.sg/ibook/gethome.do')

print "forms"
br.select_form(name="calendarForm")
print "forms"

# these two come from the code you posted
# where you would normally put in your username and password


br.find_control(name="apptDetails.apptType").value = ['PRAP']



res = br.submit()



# Save the response body as-is. Writing an extra '\n' after each line
# returned by readlines() would double every line break.
html = br.response().read()
with open("html.txt", "w") as out:
    out.write(html)

br.close()
print "Success!\n"

1 Answer:

Answer 0 (score: 0)

After the call to br.submit(), you have navigated to the next page. That page contains two forms, both named "calendarForm":

>>> for f in br.forms():
...     print f
...     print
... 
<calendarForm POST https://eappointment.ica.gov.sg/ibook/loginSelection.do application/x-www-form-urlencoded
  <HiddenControl(org.apache.struts.taglib.html.TOKEN=2eb104ffed4bc6ea09b67e556e5dd6e2) (readonly)>
  <SelectControl(apptDetails.apptType=[CS, CSCA, CSCR, CSIC, CSPC, CSXX, PR, *PRAP, PRCF, PRAR, PRIC, PRNN, PRTR, CSXX, VS, VSEI, VSLA, VSLT, VSSP, VSST, VSVP])>>

<calendarForm POST https://eappointment.ica.gov.sg/ibook/publicLogin.do application/x-www-form-urlencoded
  <HiddenControl(org.apache.struts.taglib.html.TOKEN=2eb104ffed4bc6ea09b67e556e5dd6e2) (readonly)>
  <HiddenControl(apptDetails.apptType=PRAP) (readonly)>
  <TextControl(apptDetails.identifier1=)>
  <TextControl(apptDetails.identifier2=)>
  <TextControl(apptDetails.identifier3=)>
  <IgnoreControl(Clear=<None>)>>
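
To check for this kind of name clash without reading the whole listing, a small sketch like the following could report which form positions share a name. It assumes only that each object yielded by br.forms() has a `name` attribute, as the listing above suggests; `indices_of_forms_named` is a hypothetical helper, not part of mechanize.

```python
def indices_of_forms_named(forms, name):
    """Return the positions of all forms whose name equals `name`."""
    return [i for i, form in enumerate(forms) if form.name == name]


# Hypothetical usage with the browser from the question:
#   indices_of_forms_named(br.forms(), "calendarForm")
# More than one index means the name alone is ambiguous, so the form
# has to be chosen by position (nr=...) or by some other criterion.
```

If more than one index comes back, selecting by name alone cannot tell the forms apart.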

You need to select the second form (index 1), so:

>>> br.select_form(nr=1)
>>> print br.form
<calendarForm POST https://eappointment.ica.gov.sg/ibook/publicLogin.do application/x-www-form-urlencoded
  <HiddenControl(org.apache.struts.taglib.html.TOKEN=2eb104ffed4bc6ea09b67e556e5dd6e2) (readonly)>
  <HiddenControl(apptDetails.apptType=PRAP) (readonly)>
  <TextControl(apptDetails.identifier1=)>
  <TextControl(apptDetails.identifier2=)>
  <TextControl(apptDetails.identifier3=)>
  <IgnoreControl(Clear=<None>)>>
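
A hard-coded index stops working if the page ever reorders its forms. As an alternative sketch, select_form also accepts a predicate callable, so the form can be picked by a control it contains. `has_control` and `select_form_with_control` are hypothetical helper names, and the code assumes each form object exposes a `controls` list whose items have a `name` attribute.

```python
def has_control(form, control_name):
    """True if any control in the form carries the given name."""
    return any(control.name == control_name for control in form.controls)


def select_form_with_control(br, control_name):
    """Select the first form on the current page that has the control."""
    br.select_form(predicate=lambda form: has_control(form, control_name))


# Hypothetical usage with the browser from the question:
#   select_form_with_control(br, "apptDetails.identifier1")
```

On the page shown above only the second form has apptDetails.identifier1, so this should end up on the same form as nr=1.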

Now you can fill in the form's fields:

>>> br.form['apptDetails.identifier1'] = '12345'
>>> br.form['apptDetails.identifier2'] = '99'
>>> br.form['apptDetails.identifier3'] = '911'
>>> print br.form
<calendarForm POST https://eappointment.ica.gov.sg/ibook/publicLogin.do application/x-www-form-urlencoded
  <HiddenControl(org.apache.struts.taglib.html.TOKEN=2eb104ffed4bc6ea09b67e556e5dd6e2) (readonly)>
  <HiddenControl(apptDetails.apptType=PRAP) (readonly)>
  <TextControl(apptDetails.identifier1=12345)>
  <TextControl(apptDetails.identifier2=99)>
  <TextControl(apptDetails.identifier3=911)>
  <IgnoreControl(Clear=<None>)>>
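
One detail from the question's HTML: the identifier1 input uppercases its value in an onblur handler and declares maxlength="9". mechanize does not run JavaScript, so a script may want to apply the same normalization itself. A minimal sketch, where `normalize_identifier` is a hypothetical helper and the length limit is simply copied from the markup (whether the server enforces it is not known):

```python
def normalize_identifier(value, max_length=9):
    """Uppercase the value and reject it if it exceeds the field length."""
    cleaned = value.upper()
    if len(cleaned) > max_length:
        raise ValueError("identifier longer than %d characters" % max_length)
    return cleaned


# Hypothetical usage:
#   br.form['apptDetails.identifier1'] = normalize_identifier('s1234567a')
```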

Finally, submit and save the response for inspection:

>>> br.submit()
>>> open('response.html', 'w').write(br.response().read())
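
The one-liner above never closes the file explicitly. A slightly more defensive sketch writes the body in binary mode inside a with block; `save_body` is a hypothetical helper, and it assumes br.response().read() returns the raw page body.

```python
def save_body(body, path):
    """Write the raw response body to `path` and return its length."""
    with open(path, 'wb') as handle:
        handle.write(body)
    return len(body)


# Hypothetical usage:
#   save_body(br.response().read(), 'response.html')
```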

If you entered valid values for the 3 identifiers, you should now be on the next page, whatever that turns out to be.