mechanize: Why does my list of forms contain only 1 element?

Time: 2014-04-04 10:49:52

Tags: python forms mechanize

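I'm new to mechanize and not the most advanced Python user either, but I want to automate a task in which I feed input to a web page. The problem is that the "submit" button has no control name assigned. So I did some research and found a way to set a value on the form in question. But to do that, I have to access the specific form I want to assign a value to. So my code looks like this: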

forms = [f for f in br.forms()]
print forms[0].controls[0].name

I simply assumed I could access a form by writing forms[x], followed by:

forms[x].set_value("VALUE", 
                       nr=5)

The error I get is:

    forms[54].set_value("VALUE",nr=100)
IndexError: list index out of range

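This may be a silly question, probably because I don't really understand the functions I'm using, but since there is no real documentation, I'd really appreciate some help here.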

PS: I can print all the forms using:

for f in br.forms():
    print f

with the output:

 <CheckboxControl(lookup=[yes])>
  <TextControl(fld=NoName)>
  <TextControl(pixemail=)>
  <IgnoreControl(<None>=<None>)>
  <TextControl(ra=00 00 00.0)>
  <TextControl(dec=00 00 00.0)>
  <SelectControl(equinox=[*J2000.0, B1950.0])>
  <TextControl(offra=0.0)>
  <TextControl(offdec=0.0)>
  <TextControl(epoch=2000.0)>
  <SubmitControl(<None>= Retrieve Data ) (readonly)>
  <RadioControl(cextract=[*rect, circle])>
  <TextControl(rawid=10.0)>
  <TextControl(decwid=10.0)>
  <SelectControl(wunits=[Degrees, *Minutes, Seconds])>
  <TextControl(cirrad=10.0)>
  <SelectControl(cat=[UCAC 2, UCAC 3, NOMAD, *USNO B1.0, USNO A2.0, ACT])>
  <SelectControl(surims=[None, *All Surveys, POSS-I (103aO, 103aE), POSS-II (IIIaJ, IIIaF, IV-N), SOUTH, AAO-R, POSS-IO, POSS-IE, POSS-IIJ, POSS-IIF, POSS-IIN, SRC-J, SERC-EJ, ESO-R, SERC-ER])>
  <CheckboxControl(getcat=[*yes])>
  <CheckboxControl(getfin=[*yes])>
  <CheckboxControl(pixflg=[yes])>
  <CheckboxControl(colbits=[All, *cb_id, *cb_altid, *cb_ra, *cb_sigra, cb_mep, *cb_mura, cb_muprob, *cb_smura, cb_sfitra, *cb_fitpts, cb_err, *cb_flg, *cb_mag, cb_smag, *cb_mflg, *cb_fldid, *cb_sg, cb_xres, cb_pltidx, *cb_xi, *cb_dstctr, *cb_gall])>
  <RadioControl(skey=[*ra, dec, sigra, sigdec, mep, mura, mudec, muprob, smura, smudec, sfitra, sfitdec, fitpts, err, flg, mag, smag, mflg, fldid, sg, xres, yres, pltidx, clr, sigpos, mutot, sigmu, xi, eta, dstctr, gall, galb])>
  <SelectControl(slf=[*hh/dd mm ss, hh/dd:mm:ss, hh.hhh/dd.ddd, ddd.ddd/dd.ddd])>
  <TextControl(minnpts=0)>
  <TextControl(maxnpts=10)>
  <SelectControl(clr=[B1, R1, B2, *R2, I2, B, V, R, J, H, K])>
  <TextControl(bri=0)>
  <TextControl(fai=100)>
  <SelectControl(clr0m1A=[B1, R1, *B2, R2, I2, B, V, R, J, H, K])>
  <SelectControl(clr0m1B=[B1, R1, B2, *R2, I2, B, V, R, J, H, K])>
  <TextControl(bmrmin=-100)>
  <TextControl(bmrmax=100)>
  <TextControl(minposnerr=0.0)>
  <TextControl(maxposnerr=10000.0)>
  <TextControl(mumin=0.0)>
  <TextControl(mumax=10000.0)>
  <TextControl(minmuerr=0.0)>
  <TextControl(maxmuerr=10000.0)>
  <TextControl(minsep=0.0)>
  <HiddenControl(minmagerr=0.0) (readonly)>
  <HiddenControl(maxmagerr=1.0) (readonly)>
  <SelectControl(opstars=[Yes, *No])>
  <SelectControl(whorbl=[Light Stars/Dark Sky, *Dark Stars/Light Sky])>
  <SelectControl(pixgraph=[Progressive JPEG, *JPEG, GIF, PDF, Large JPEG (1 Survey Only), Large GIF (1 Survey Only), PS (1 Survey Only)])>
  <SelectControl(pixfits=[Yes, *No])>
  <SelectControl(ori=[NE - North Up, East Right, *NW - North Up, East Left, SE - North Down, East Right, SW - North Down, East Left, EN - East Up, North Right, ES - East Up, North Left, WN - East Down, North Right, WS - East Down, North Left])>
  <SelectControl(tck=[N and E marks, *Tick Marks, Grid Lines])>
  <SelectControl(starlbl=[Yes, *No])>
  <SelectControl(cmrk=[*None, 5.0 sec Box, 10.0 sec Box, 30.0 sec Box, 1.0 min Box, 2.0 min Box, 5.0 min Box, 10.0 min Box, 5.0 sec Circle, 10.0 sec Circle, 30.0 sec Circle, 1.0 min Circle, 2.0 min Circle, 5.0 min Circle, 10.0 min Circle])>
  <TextControl(aobj=none)>
  <SelectControl(pcl=[*P - Points, L - Points + Labels, C - Connected Points, A - Connected Points + Labels])>
  <TextareaControl(atbl=  )>
  <IgnoreControl(<None>=<None>)>
  <SubmitControl(<None>= Retrieve Data ) (readonly)>
  <SelectControl(gzf=[*Yes, No])>
  <SelectControl(cftype=[*ASCII, XML/VO])>>

The one I want to reach is <SubmitControl(<None>= Retrieve Data ) (readonly)>, the third one counting from the bottom.

3 answers:

Answer 0 (score: 0)

I'm pretty sure there's a better way to do this, but I'm not familiar with mechanize. You could do something like:

# br.forms() yields forms, not controls, so look inside each form's controls
submit_values = [c for f in br.forms() for c in f.controls if c.type == 'submit']
if submit_values:
    print(submit_values[0])

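If there is more than one, you will obviously get several. This is probably the weirdest way to accomplish whatever you're trying to do. Also, assuming this form is fairly static, you could replace Mechanize with requests. It would then look something like this: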

import requests

r = requests.post("http://form/action/url/goes/here", data={"lookup": "yes",
                                                            # all the other elements
                                                            })
print(r.status_code)
print(r.text)
# Do something else with r.text, e.g. scrape values with beautifulsoup or something
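
As a rough sketch of how the data dict might be filled in: the field names and default values below are copied from the control listing in the question, but the selection of fields is only illustrative, and whether the server accepts a partial set of fields is an assumption that has not been checked. urlencode from the standard library is used here only to show what the form-encoded body would look like.

```python
from urllib.parse import urlencode

# Field names and defaults taken from the printed form in the question.
# This is an illustrative subset, not the full set the server may expect.
form_data = {
    "ra": "00 00 00.0",
    "dec": "00 00 00.0",
    "equinox": "J2000.0",
    "cat": "USNO B1.0",
}

# Form-encode the fields the way an HTML form POST body is encoded
# (spaces become '+', pairs are joined with '&').
body = urlencode(form_data)
print(body)
```

The same dict could be passed as data= to requests.post, which form-encodes a dict for you.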

Answer 1 (score: 0)

Try this. You can search for a control that has no name by its control type. Just going from memory:

br.form.find_control(type='submit', nr=1)

I think that's the right syntax. I'll double-check to be sure.
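
As background on why looking up by type and position makes sense here: judging from the output in the question, the page seems to contain a single form that holds many controls, which would explain why the list of forms has one element and forms[54] raises IndexError. The sketch below does not use mechanize; it uses the standard-library html.parser on a made-up, heavily shortened stand-in for the page (SAMPLE_HTML and FormCollector are hypothetical names) to show the forms-versus-controls distinction.

```python
from html.parser import HTMLParser

# Hypothetical, heavily shortened stand-in for the page in the question.
SAMPLE_HTML = """
<form action="/cgi-bin/example" method="post">
  <input type="text" name="ra" value="00 00 00.0">
  <input type="text" name="dec" value="00 00 00.0">
  <input type="submit" value=" Retrieve Data ">
  <input type="text" name="rawid" value="10.0">
  <input type="submit" value=" Retrieve Data ">
  <select name="gzf"><option selected>Yes</option><option>No</option></select>
</form>
"""


class FormCollector(HTMLParser):
    """Collect each <form> as a list of (type, name) tuples, one per control."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # A new form starts: begin a fresh list of controls.
            self.forms.append([])
        elif tag in ("input", "select", "textarea") and self.forms:
            # Controls without a type attribute fall back to their tag name.
            control_type = attrs.get("type", tag)
            self.forms[-1].append((control_type, attrs.get("name")))


collector = FormCollector()
collector.feed(SAMPLE_HTML)
forms = collector.forms

print(len(forms))     # number of forms on the page
print(len(forms[0]))  # number of controls inside that form

# Unnamed submit buttons can still be picked out by type and position.
submits = [control for control in forms[0] if control[0] == "submit"]
print(submits)
```

The index into the list of forms picks a form, while a type plus a position picks a control inside it, which is the idea behind the find_control call above.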

Answer 2 (score: 0)

Try this:

import mechanize

br = mechanize.Browser()

# Insert the desired URL here
br.open('http://www.nofs.navy.mil/data/fchpix/cfch.html#fchmenu')
br.select_form(nr=0)

br["ra"] = "input 1"
br["dec"] = "input 2"
br["pixfits"] = ["Yes"]

br.find_control("pixflg").items[0].selected=True

response = br.submit()