Unable to log in through a form using Mechanize

Date: 2011-02-18 17:02:03

Tags: ruby-on-rails ruby screen-scraping nokogiri mechanize

I am trying to submit a form using Mechanize, but nothing happens when I submit it. I just get the login page again.

The form:

http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php

require 'Mechanize'
agent = WWW::Mechanize.new
agent.get("http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php")

form = agent.page.forms.first
form.username = 'username'
form.password = 'password'
form.submit

Live HTTP log when logging in with a browser:

http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php

POST /partnersystem/mylogins.php HTTP/1.1
Host: affilate.mikkelsenmedia.dk
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; da; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: da,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php
Cookie: XXX
Content-Type: application/x-www-form-urlencoded
Content-Length: 47
username=username&password=password&send=Submit
HTTP/1.1 200 OK
Date: Fri, 18 Feb 2011 17:07:15 GMT
Server: Apache/2.0.63 (CentOS)
X-Powered-By: PHP/5.1.6
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Length: 77
Connection: close
Content-Type: text/html; charset=UTF-8

Console:

irb(main):001:0> require 'Mechanize'
=> true
irb(main):002:0> agent = Mechanize.new { |agent|
irb(main):003:1*   agent.user_agent_alias = 'Mac Safari'
<"http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php")
=> #<Mechanize::Page
 {url
  #<URI::HTTP:0x2a1c770 URL:http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php>}
 {meta}
 {title "Partner System - Log-in"}
 {iframes}
 {frames}
 {links}
 {forms
  #<Mechanize::Form
   {name "loginform"}
   {method "POST"}
   {action "mylogins.php"}
   {fields
    #<Mechanize::Form::Text:0x2836680
     @name="username",
     @node=
      #(Element:0x141b3e8 {
        name = "input",
        attributes = [
          #(Attr:0x13e4458 { name = "type", value = "text" }),
          #(Attr:0x13e4440 { name = "name", value = "username" }),
          #(Attr:0x13e43e0 { name = "size", value = "30" })]
        }),
     @value="">
    #<Mechanize::Form::Field:0x2836230
     @name="password",
     @node=
      #(Element:0x141b22c {
        name = "input",
        attributes = [
          #(Attr:0x13ac148 { name = "type", value = "password" }),
          #(Attr:0x13ac13c { name = "name", value = "password" }),
          #(Attr:0x13ac10c { name = "size", value = "30" })]
        }),
     @value="">}
   {radiobuttons}
   {checkboxes}
   {file_uploads}
   {buttons
    #<Mechanize::Form::Submit:0x2835f90
     @name="send",
     @node=
      #(Element:0x141b01c {
        name = "input",
        attributes = [
          #(Attr:0x13702e0 { name = "type", value = "submit" }),
          #(Attr:0x13702d4 { name = "name", value = "send" }),
          #(Attr:0x13702c8 { name = "class", value = "style2" }),
          #(Attr:0x13702bc { name = "value", value = "Submit" })]
        }),
     @value="Submit">}>}>

irb(main):006:0>   form = agent.page.forms.first
irb(main):007:0>   form.username = 'username'
=> "username"
irb(main):008:0>   form.password = 'password'
=> "password"
irb(main):009:0>   form.submit
=> #<Mechanize::Page
 {url
  #<URI::HTTP:0x2a82e78 URL:http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php>}
 {meta}
 {title "Partner System - Log-in"}
 {iframes}
 {frames}
 {links}
 {forms
  #<Mechanize::Form
   {name "loginform"}
   {method "POST"}
   {action "mylogins.php"}
   {fields
    #<Mechanize::Form::Text:0x2a52c50
     @name="username",
     @node=
      #(Element:0x1529694 {
        name = "input",
        attributes = [
          #(Attr:0x1513c14 { name = "type", value = "text" }),
          #(Attr:0x1513c08 { name = "name", value = "username" }),
          #(Attr:0x1513bfc { name = "size", value = "30" })]
        }),
     @value="">
    #<Mechanize::Form::Field:0x2a52998
     @name="password",
     @node=
      #(Element:0x1529550 {
        name = "input",
        attributes = [
          #(Attr:0x15121d4 { name = "type", value = "password" }),
          #(Attr:0x15121c8 { name = "name", value = "password" }),
          #(Attr:0x15121bc { name = "size", value = "30" })]
        }),
     @value="">}
   {radiobuttons}
   {checkboxes}
   {file_uploads}
   {buttons
    #<Mechanize::Form::Submit:0x2a52758
     @name="send",
     @node=
      #(Element:0x152940c {
        name = "input",
        attributes = [
          #(Attr:0x151062c { name = "type", value = "submit" }),
          #(Attr:0x1510614 { name = "name", value = "send" }),
          #(Attr:0x1510608 { name = "class", value = "style2" }),
          #(Attr:0x15105fc { name = "value", value = "Submit" })]
        }),
     @value="Submit">}>}>

irb(main):010:0>

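3 answers:

Answer 0: (score: 11)

The server also checks for the presence of the submit button parameter 'send'. Calling form.submit without a button leaves that parameter out of the POST body, so the server simply returns the login page again.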

Add this line before form.submit:

form.add_field! 'send','Submit'
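
To make the difference concrete, here is a minimal sketch that uses only Ruby's standard library (no Mechanize, no network access). It encodes the form fields with and without the 'send' button, on the assumption that a plain form.submit sends only the two text fields. The variable names are illustrative.

```ruby
require 'uri'

# Assumed body of the request when the form is submitted without a button:
# only the two text fields are encoded.
without_button = URI.encode_www_form(
  'username' => 'username',
  'password' => 'password'
)

# Body the browser sent according to the HTTP log in the question:
# the submit button's name/value pair is included as well.
with_button = URI.encode_www_form(
  'username' => 'username',
  'password' => 'password',
  'send'     => 'Submit'
)

puts without_button           # username=username&password=password
puts without_button.bytesize  # 35
puts with_button              # username=username&password=password&send=Submit
puts with_button.bytesize     # 47
```

The second body is 47 bytes long, which matches the Content-Length: 47 header in the browser's request. That is consistent with the browser sending the button parameter.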

Answer 1: (score: 2)

I noticed one thing when running your example:

>> agent = WWW::Mechanize.new
!!!!! DEPRECATION NOTICE !!!!!
The WWW constant is deprecated, please switch to the new top-level Mechanize
constant.  WWW will be removed in Mechanize version 2.0

You've referenced the WWW constant from (irb):3:in `irb_binding', please
switch the "WWW" to "Mechanize".  Thanks!

Sincerely,

  Pew Pew Pew

I don't have an account on that system, so logging in won't work for me. What do you see if you run the initial steps in IRB?

Answer 2: (score: 0)

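When you run "agent = WWW::Mechanize.new" you get a DEPRECATION warning, because the "WWW" constant is deprecated in recent versions. Run the command without the 'WWW' constant instead. The changed command is:

agent = Mechanize.new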

  

Thanks
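
Combining the points raised in the answers (use the top-level Mechanize constant and make sure the 'send' button parameter is submitted), a login routine might look like the sketch below. The helper name log_in is hypothetical and not part of Mechanize. The form name 'loginform' and the button name 'send' are taken from the console output in the question. Passing the button to agent.submit is offered here as an alternative to add_field!, and it has not been verified against this particular site.

```ruby
# Hypothetical helper (not part of Mechanize): fills in the login form and
# submits it together with its 'send' button so the server receives
# send=Submit alongside the credentials.
def log_in(agent, url, username, password)
  page = agent.get(url)

  # 'loginform' is the form name shown in the question's console output.
  form = page.form_with(name: 'loginform')
  raise 'login form not found' if form.nil?

  form['username'] = username
  form['password'] = password

  # Submitting with an explicit button includes that button's name and
  # value in the POST body.
  agent.submit(form, form.button_with(name: 'send'))
end

# Usage (requires the mechanize gem and network access):
#   require 'mechanize'
#   agent = Mechanize.new
#   page  = log_in(agent,
#                  'http://affilate.mikkelsenmedia.dk/partnersystem/mylogins.php',
#                  'username', 'password')
#   puts page.title
```

If the returned page still has the title "Partner System - Log-in", the login was most likely rejected. Comparing the request with the browser's HTTP log is then a reasonable next step.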