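I am currently trying to scrape data from a web page with Ruby, using Nokogiri and Mechanize. I want to fetch the list of tenders from the following link: http://www.panamacompra.gob.pa/ambientepublico/AP_BusquedaAvanzada.aspx

This is the procedure I follow. Here is my code: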
```ruby
require 'rubygems'
require 'mechanize'

# Configured agent. Note that it is never used below; all requests go through @m.
a = Mechanize.new do |agent|
  agent.user_agent_alias = 'Mac Safari'
  agent.follow_meta_refresh = true
end

@url = 'http://www.panamacompra.gob.pa/ambientepublico/AP_BusquedaAvanzada.aspx'
@m = Mechanize.new
@payload = ''
@body_page = ''
@search_string = '2017-1-37-0-15-cm-011063'
@viewstate = ""

# Form fields sent in the first POST.
def set_payload
  {
    'txtGSA' => '',
    'ctl00$ContentPlaceHolder1$txtNumeroAdquisicion' => '',
    'ctl00$ContentPlaceHolder1$txtNombreAdquisicion' => '',
    'ctl00$ContentPlaceHolder1$txtNombreDemandante' => '',
    'ctl00$ContentPlaceHolder1$txtNombreDependencia' => '',
    'ctl00$ContentPlaceHolder1$txtNombreProveedor' => '',
    'ctl00$ContentPlaceHolder1$txtFechaDesde' => '13-02-2017',
    'ctl00$ContentPlaceHolder1$txtFechaHasta' => '13-03-2017',
    'ctl00$ContentPlaceHolder1$txtNombreRubro' => '',
    'ctl00_ContentPlaceHolder1_ASPxPopupControl1WS' => '0:0:-1:0:0:0:0:0:;0:0:-1:0:0:0:0:0:',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidTotalPaginas' => '0',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidNumeroPagina' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidOrigen' => '0',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidTotalFilas' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidInicioAnterior' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidFinAnterior' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidBloqueInicio' => '1',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMaxFilasPorPagina' => '20',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMaxPaginasPorListado' => '9',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidCambioBloque' => 'False',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMostrarEstado' => 'False',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidMostrarMensaje' => 'True',
    'ctl00$ContentPlaceHolder1$ControlPaginacion$hidValoresPorDefecto' => 'True',
    'ctl00$ContentPlaceHolder1$hidIdDependencia' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreDependencia' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdOrgV' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdEmpresaVenta' => '-1',
    'ctl00$ContentPlaceHolder1$hidIdEmpresaC' => '0',
    'ctl00$ContentPlaceHolder1$hidIdOrgC' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreDemandante' => '-1',
    'ctl00$ContentPlaceHolder1$hidDependencia' => '-1',
    'ctl00$ContentPlaceHolder1$hidIDRubro' => '-1',
    'ctl00$ContentPlaceHolder1$hidRedir' => '',
    'ctl00$ContentPlaceHolder1$hidRangoMaximoFecha' => '',
    'ctl00$ContentPlaceHolder1$hidIDProducto' => '-1',
    'ctl00$ContentPlaceHolder1$hidIDProductoNoIngresado' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreProducto' => '-1',
    'ctl00$ContentPlaceHolder1$hidNombreProveedor' => '-1',
    'ctl00$ContentPlaceHolder1$lstUnidadCompra' => '',
    'ctl00$ContentPlaceHolder1$lstEstado' => '0'
  }
end
```

```ruby
@m.get @url do |page|
  page.form_with :name => "aspnetForm" do |search_form|
    # Read the ASP.NET view state from the initial page (stored, but not sent in the payload).
    @viewstate = search_form.field_with(:name => "__VIEWSTATE").value
    @payload = set_payload
    # POST the payload, then fill in the tender number on the returned form and submit it.
    @m.post(@url, @payload).form_with :name => "aspnetForm" do |search_form_2|
      search_form_2.field_with(:name => "ctl00$ContentPlaceHolder1$txtNumeroAdquisicion").value = @search_string
      submit_button = search_form_2.button_with(:id => "ctl00_ContentPlaceHolder1_btnBuscar")
      finish = search_form_2.submit(submit_button)
      @body_page = finish
    end
    puts Nokogiri::HTML(@body_page.body)
  end
end
```
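One thing I am unsure about is whether the hidden ASP.NET fields have to travel with the payload: the code above reads `__VIEWSTATE`, but `set_payload` never includes it. Below is a minimal sketch of copying hidden fields into a payload before encoding it. It is only an assumption that this page requires them. `__EVENTVALIDATION` and the sample values are hypothetical, and `HiddenField` is a stand-in for Mechanize's form fields (which also respond to `name` and `value`) so that the snippet runs without a network call.

```ruby
require 'uri'

# Stand-in for a form field (hypothetical); Mechanize form fields
# also respond to #name and #value.
HiddenField = Struct.new(:name, :value)

# Returns a new hash containing every hidden field plus the payload.
# Keys already present in the payload win over hidden-field values.
def with_hidden_fields(payload, hidden_fields)
  hidden = hidden_fields.each_with_object({}) do |field, acc|
    acc[field.name] = field.value.to_s
  end
  hidden.merge(payload)
end

# Hypothetical values, for illustration only.
hidden = [
  HiddenField.new('__VIEWSTATE', 'dDwtMTIz'),
  HiddenField.new('__EVENTVALIDATION', 'wEWAgK')
]
payload = {
  'ctl00$ContentPlaceHolder1$txtNumeroAdquisicion' => '2017-1-37-0-15-cm-011063'
}

full_payload = with_hidden_fields(payload, hidden)

# Form-encode the merged payload the way it would be sent in a POST body.
puts URI.encode_www_form(full_payload)
```

With Mechanize, the hidden fields should be available from the form itself (for example via `search_form.hiddens`), so the merged hash could be passed to `@m.post` in place of the bare payload.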
Why does the form not execute the POST? The response comes back without the posted information.
Result:

```html
<td class="style1" align="left" valign="top">
  <input name="ctl00$ContentPlaceHolder1$txtNumeroAdquisicion" type="text" value="2017-1-37-0-15-cm-011063" id="ctl00_ContentPlaceHolder1_txtNumeroAdquisicion">
  <span class="formEjemplos2"><i>Ej.: 2008-1-027-00-08-LP-000274</i></span>
  <div id="divNumLC"></div>
</td>
```
The value I sent shows up in the field, but the table of tenders for that number does not appear.