Mechanize Python script

Time: 2016-09-12 00:15:32

Tags: python html forms mechanize hidden

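First, let me apologize for being brand new to this. A few years ago a friend asked me whether I could write a program to automatically grab substitute teaching openings. It wasn't in any area I knew, but a few tutorials got me far enough to solve it without knowing HTML (and, for that matter, knowing only a little Python). The script has worked well since then, but this year their website appears to have been redone, which broke things and pushed it well beyond my understanding.

My previous code, which worked: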

import mechanize

# Create a Browser instance
b = mechanize.Browser()
# Load the page (loginURL is set elsewhere in the script, not shown here)
b.open(loginURL)

# Select the first form on the page
b.select_form(nr=0)

# Fill out the form
b['id'] = 'XXXXXXXXXX'   # Note: I edited out my friend's login info here for privacy
b['pin'] = 'XXXX'

b.submit()

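There is still only one form, but its controls now show up as type "hidden" rather than the ones I actually need. When I inspect the page in developer mode I can see the old fields in the HTML, with the same names, but I can't figure out how to access them now (I tried a few things that didn't work). Here is the HTML: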



<form id="loginform" name="loginform" method="post" action="https://www.aesoponline.com/login.asp?x=x&amp;&amp;pswd=&amp;sso=">

  <input type="hidden" name="location" value="">
  <input type="hidden" name="qstring" value="">
  <input type="hidden" name="absr_ID" value="">
  <input type="hidden" name="foil" value="">


  <div style="margin: auto; text-align:center;">
    <div id="loginContainer" style="text-align: left;">
      <div id="loginContent">
        <div id="Div1" style="position:relative; left:65px;" class="hide-me-for-rebranding">
          <a href="http://www.frontlinetechnologies.com">
            <img src="images/frontlinelogo.png" border="0">
          </a>
        </div>
        <div id="loginLoginBox" style="position:relative;">
          <div id="loginAesopLogo" style="padding-bottom:0px;" class="hide-me-for-rebranding"></div>
          <!--endloginAesopLogo-->
          <div id="loginLoginFields" style="margin-top:0px;">
            <br>

            <table>
              <tbody>
                <tr height="25px">
                  <td width="30px"><span class="corrLoginFormText">ID:</span>
                  </td>
                  <td>
                    <input type="text" class="loginFormText" maxlength="80" id="txtLoginID" name="id" value="">
                  </td>
                </tr>
                <tr height="25px">
                  <td width="30px"><span class="corrLoginFormText">Pin:</span>
                  </td>
                  <td>
                    <input type="password" class="loginFormText" maxlength="20" id="txtPassword" name="pin">
                  </td>
                </tr>
              </tbody>
            </table>

            <table>
              <tbody>
                <tr height="30px">
                  <td width="75px" valign="top">
                    <a class="textButton" id="loginLink" name="loginLink" href="#"><span style="white-space:nowrap;">Login</span></a> 
                    <input type="hidden" id="submitLogin" name="submitLogin" value="1">
                  </td>
                  <td>

                    <div id="loginhelp" style="float:right;">
                      <img src="images/icon.pinreminder.png" alt="pin" width="10" height="15" align="top"><a href="forgot_pin.asp">Pin Reminder</a>
                      <br>
                      <img src="images/icon.loginproblems.png" alt="login" width="11" height="17" align="top"> <a href="http://help.frontlinek12.com/Employee/Docs/ClientServicesHelpGuide-LoginProblems.pdf">Login Problems</a>
                    </div>

                  </td>
                </tr>
              </tbody>
            </table>
          </div>
          <!--endloginLoginFields-->
          <div id="errorLabel" style="position: absolute; top: 170px; left:5px;margin:0px;"><span class="assistanceText"></span>
          </div>
        </div>
        <!--endloginLoginBox-->
        <div id="loginContentText">
          <span class="loginContentHeader">Welcome To Absence Management</span>
          <br>
          <span class="loginContentText">
        				You are about to enter Frontline Absence Management!<br> Please enter your ID and PIN to login to your account, or click the button below to learn more about Frontline's growing impact on education.</span> 
          <br>
          <a class="textButton" href="http://www.frontlinek12.com/Products/Aesop.html"><span>Learn More</span></a>

        </div>
        <!--endloginContentText-->

      </div>
      <!--endLoginContent-->
      <div id="loginFooterShading" class="hide-me-for-rebranding">
        <div id="loginFooterLeft"></div>
        <div id="loginFooterRight"></div>
      </div>
      <!--endloginFooterShading-->
      <div id="loginFooter" style="text-align:center;width:725px;">
        <a href="http://www.frontlinetechnologies.com/Privacy_Policy.html" style="color: rgb(153, 0, 0) ; font-size:9px;" target="_blank">Privacy Policy</a>
        <br>© Frontline Technologies Group LLC &lt;
        <parm1>&gt;
          <br>All rights reserved. Protected under US Patents 6,334,133, 6,675,151, 7,430,519, 7,945,468 and 8,140,366 with additional patents pending.
        </parm1>
      </div>
      <!--endloginReflections-->
    </div>
    <!--endLoginContainer-->


  </div>
  <!--end margin div -->
  <!-- MODAL DIALOG -->
  <div id="basicModalContent" style="display:none">
    <span class="assistanceText"></span>
  </div>
</form>

Any help is greatly appreciated. Thank you very much.

2 answers:

Answer 0: (score: 0)

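If that HTML is exactly what is on the page, try selecting the form by name instead of by index. When you call b.select_form(nr=0), the first form may, for some reason, not be the one you expect. By looking the form up by name in b.select_form(), you can make sure you get the right one. Test it and see whether it works.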
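
A minimal sketch of that idea, not necessarily the exact code this answer had in mind. It assumes the form keeps the name "loginform" shown in the question's HTML and that the fields are still called "id" and "pin". The helper name log_in and its parameters are made up for illustration; the browser argument is expected to be a mechanize.Browser() instance.

```python
def log_in(browser, login_url, user_id, pin, form_name="loginform"):
    """Open the login page, pick the form by its name attribute, and submit it.

    `browser` is assumed to behave like mechanize.Browser: it must provide
    open(), select_form(name=...), item assignment for controls, and submit().
    """
    browser.open(login_url)
    # Select by name rather than by position, so an extra form earlier in the
    # page cannot be picked up by accident.
    browser.select_form(name=form_name)
    browser["id"] = user_id
    browser["pin"] = pin
    return browser.submit()
```

With mechanize installed, this would be called as log_in(mechanize.Browser(), loginURL, 'XXXXXXXXXX', 'XXXX'), and the returned response can be read with .read() to check whether the login went through.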


Answer 1: (score: 0)

Try this:

import mechanize

b = mechanize.Browser()
# Ignore <meta http-equiv> headers and robots.txt
b.set_handle_equiv(False)
b.set_handle_robots(False)
# Send browser-like request headers
b.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64; rv:18.0)Gecko/20100101 Firefox/18.0 (compatible;)'),('Accept', '*/*')]
b.open(loginURL).read()

b.select_form(nr=0)

b['id'] = 'XXX'
b['pin'] = 'XXX'

resp = b.submit()
print(resp.read())

Works for me!
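
If the submit still fails, it may help to confirm which controls the form actually contains before filling it in. The sketch below uses only the standard library, and runs against a trimmed-down copy of the form from the question (layout markup omitted) rather than the live page. The class name FormInputLister and the helper list_inputs are made up for illustration.

```python
from html.parser import HTMLParser


class FormInputLister(HTMLParser):
    """Collect (type, name) for every <input> found inside a <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form:
            # An <input> without a type attribute defaults to "text" in HTML.
            self.inputs.append((attrs.get("type", "text"), attrs.get("name")))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


# Trimmed-down copy of the login form from the question.
SAMPLE_HTML = """
<form id="loginform" name="loginform" method="post">
  <input type="hidden" name="location" value="">
  <input type="text" id="txtLoginID" name="id" value="">
  <input type="password" id="txtPassword" name="pin">
  <input type="hidden" id="submitLogin" name="submitLogin" value="1">
</form>
"""


def list_inputs(html):
    """Return the (type, name) pairs of all form inputs in the given HTML."""
    parser = FormInputLister()
    parser.feed(html)
    return parser.inputs


for input_type, name in list_inputs(SAMPLE_HTML):
    print(input_type, name)
```

On this sample the "id" and "pin" fields appear as ordinary text and password inputs next to the hidden ones. If the same check on the HTML that the script actually downloads shows only hidden inputs, the downloaded page differs from what the browser's developer tools display.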