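I am new to web scraping and have been stuck on this website for more than a week.
What I want is to fill in a form with a state and a city, after which the site returns a table. I want the data from those tables.
I don't know what to put in browser['???????'] = '????' to address the form. I have already tried 'option', '#estadoSAGIUFMU' and 'form,#estadoSAGIUFMU', but none of them worked.
This is the state selector that can be submitted: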
<table>
<tr>
<td>
<select name='uf_ibge' style='width: 240px' id='estadoSAGIUFMU' onchange="document.getElementById('nome_estadoSAGIUFMU').value=this.options[this.selectedIndex].text;ajaxSAGIUF('selector_municipiosSAGIUF','p_ibge='+this.value);document.getElementById('selector_municipiosSAGIUF').innerHTML = '';">
set names 'utf8';select uf, n_uf as estado, id_uf as ibge from mapas.shp_uf order by uf
<option value=''>Selecione um estado</option>
<option value='12'>AC - Acre</option>
<option value='27'>AL - Alagoas</option>
<option value='13'>AM - Amazonas</option>
<option value='16'>AP - Amapá</option>
<option value='29'>BA - Bahia</option>
<option value='23'>CE - Ceará</option>
<option value='53'>DF - Distrito Federal</option>
<option value='32'>ES - Espírito Santo</option>
<option value='52'>GO - Goiás</option>
<option value='21'>MA - Maranhão</option>
<option value='31'>MG - Minas Gerais</option>
<option value='50'>MS - Mato Grosso do Sul</option>
<option value='51'>MT - Mato Grosso</option>
<option value='15'>PA - Pará</option>
<option value='25'>PB - Paraíba</option>
<option value='26'>PE - Pernambuco</option>
<option value='22'>PI - Piauí</option>
<option value='41'>PR - Paraná</option>
<option value='33'>RJ - Rio de Janeiro</option>
<option value='24'>RN - Rio Grande do Norte</option>
<option value='11'>RO - Rondônia</option>
<option value='14'>RR - Roraima</option>
<option value='43'>RS - Rio Grande do Sul</option>
<option value='42'>SC - Santa Catarina</option>
<option value='28'>SE - Sergipe</option>
<option value='35'>SP - São Paulo</option>
<option value='17'>TO - Tocantins</option>
</select>
<input name='nome_estado' id='nome_estadoSAGIUFMU' type='hidden' value=''>
</td>
<td>
<div id='selector_municipiosSAGIUF'>
</div>
</td>
</tr>
</table>
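As far as I understand, form libraries such as mechanize address a control by its `name` attribute (here `uf_ibge`) rather than by its `id` or a CSS selector, and submit the option's `value` rather than its visible label. The following is a minimal sketch, not a confirmed solution, that reads both out of the markup above using only the standard library. `SelectParser` and the shortened `SNIPPET` string are illustrative names introduced here; they are not part of the site or of mechanize.

```python
from html.parser import HTMLParser

# Shortened copy of the <select> shown above (only three options kept).
SNIPPET = """
<select name='uf_ibge' id='estadoSAGIUFMU'>
<option value=''>Selecione um estado</option>
<option value='35'>SP - São Paulo</option>
<option value='17'>TO - Tocantins</option>
</select>
<input name='nome_estado' id='nome_estadoSAGIUFMU' type='hidden' value=''>
"""


class SelectParser(HTMLParser):
    """Collect the name of each <select> and map option labels to values."""

    def __init__(self):
        super().__init__()
        self.select_names = []       # name attributes of <select> elements
        self.options = {}            # visible label -> submitted value
        self._current_value = None   # value of the <option> being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self.select_names.append(attrs.get("name"))
        elif tag == "option":
            self._current_value = attrs.get("value", "")

    def handle_data(self, data):
        label = data.strip()
        if self._current_value is not None and label:
            self.options[label] = self._current_value
            self._current_value = None

    def handle_endtag(self, tag):
        if tag == "option":
            self._current_value = None


parser = SelectParser()
parser.feed(SNIPPET)
print(parser.select_names)                # field name(s) a form library would use
print(parser.options["SP - São Paulo"])   # code that would be submitted for this state
```

If this reading is right, the mechanize key would be the field name `uf_ibge` and the value a list holding one option code, since select controls take a list of selected values.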
Here is the code I have written so far:
from mechanize import Browser
import lxml.html as lh
browser = Browser()
browser.open('http://aplicacoes.mds.gov.br/sagirmps/estrutura_fisica/preenchimento_municipio_cras_new1.php')
browser.select_form(nr=0)
browser['????'] = '????'
I also tried:
import requests
url = 'http://aplicacoes.mds.gov.br/sagirmps/estrutura_fisica/preenchimento_municipio_cras_new1.php'
r = requests.post(url, data = {'Selecione um estado':'SP - São Paulo', 'Selecione um município': 'Bauru'})
r.text
But when I use r.text, it apparently just returns the HTML of the entire page.
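One possible reason the POST attempt has no effect is that its keys are the visible labels, whereas a form submission is keyed by each input's `name` attribute. The sketch below is only an assumption about how the request might be shaped: `build_payload`, `TableParser`, `extract_rows` and `fetch_rows` are hypothetical helpers, the `sample` HTML is invented for illustration, and the municipality control is loaded by AJAX, so its field name is not visible in the markup and is left as a parameter. Whether the server answers a plain POST with the table is not confirmed.

```python
from html.parser import HTMLParser


def build_payload(state_code, state_label,
                  municipality_field=None, municipality_code=None):
    """Build form data keyed by the inputs' name attributes, not their labels."""
    payload = {"uf_ibge": state_code, "nome_estado": state_label}
    # The municipality control is injected by AJAX; its name is unknown here
    # and must be supplied by the caller once it has been looked up.
    if municipality_field is not None:
        payload[municipality_field] = municipality_code
    return payload


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text chunks of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableParser()
    parser.feed(html)
    return parser.rows


def fetch_rows(url, payload):
    """Post the form and parse the reply. Not called here: needs network access."""
    import requests  # third-party; same library as in the attempt above
    response = requests.post(url, data=payload)
    return extract_rows(response.text)


# Invented sample standing in for a server reply.
sample = "<table><tr><th>Name</th><th>Value</th></tr><tr><td>x</td><td>1</td></tr></table>"
print(build_payload("35", "SP - São Paulo"))
print(extract_rows(sample))
```

Parsing the response into rows, instead of reading `r.text` directly, keeps only the table cells even when the server sends back the whole page.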