Making a postback request with Python or Perl

Time: 2018-10-23 14:28:23

Tags: python perl mechanize www-mechanize mechanicalsoup

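I am trying to make a postback request to a website built with ASP.NET. My first attempt was in Perl with WWW::Mechanize, and then I tried Python, but neither produced a result... From what I have read, the cause could be broken SSL or the postback itself.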

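I am doing this because there is a file I want to download every day. Downloading it manually takes a lot of time, but I have not been able to automate it.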

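I am open to any language or suggestion. Below is my Perl code (note: I had to disable SSL certificate verification, because with it enabled I got no response from the server):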

use warnings;
use strict;
use WWW::Mechanize;
use Data::Dumper;
use Time::Piece;
use Time::Seconds;
use Try::Tiny;
use Log::Logger;
use File::Basename;
# Disables TLS hostname verification (insecure); must be set before the agent is created
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;

sub getLoggingTime {

    my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time);
    my $nice_timestamp = sprintf ( "%04d%02d%02d %02d:%02d:%02d",
                                   $year+1900,$mon+1,$mday,$hour,$min,$sec);
    return $nice_timestamp;
}
my $dirname = dirname(__FILE__);
my $log = Log::Logger->new;
$log->open_append($dirname . "/pmlcenace.log");
$log->log(getLoggingTime() . " INFO: Inicio del Script pmlcenace.pl");

# =============================
# General variable setup
# =============================

my $maxtry = 6;
my $dest = qq(C:/Users/MyUser/Desktop/Precios/);
my $url = 'https://www.cenace.gob.mx/SIM/VISTA/REPORTES/H_CapacidadTransfer.aspx?N=263&site=&tipoArch=C&tipoUni=SIN&tipo=Diarios';
my $params = [   
        __ASYNCPOST => 'true',
        __EVENTARGUMENT => '{"commandName":"Check","index":"0:0"}',
        __EVENTTARGET => 'ctl00$ContentPlaceHolder1$treePrincipal',
        ctl00_ContentPlaceHolder1_treePrincipal_ClientState => '{"expandedNodes":[],"collapsedNodes":[],"logEntries":[],"selectedNodes":[],"checkedNodes":["0","0:0"],"scrollPosition":0}',
        __VIEWSTATE => '/w edited for space and viewing reasons',
        __VIEWSTATEGENERATOR => '955A55B8',
        __EVENTVALIDATION => '/wEdAAOCpQshmYscjMDA9x+69HkswfFVx5pEsE3np13JV2opXVEvSNmVO1vU+umjph0DtwdLoqQBBqXirK2Np+DpA6TO2lTaZh4NXJjUyfeW6oTM9g==',
        'ctl00_ContentPlaceHolder1_ListViewNodos_ClientState' => '',
        'ctl00$ContentPlaceHolder1$NotifAvisos$hiddenState' => '',
        'ctl00_ContentPlaceHolder1_NotifAvisos_XmlPanel_ClientState' => '',
        'ctl00_ContentPlaceHolder1_NotifAvisos_TitleMenu_ClientState' => '',
        'ctl00_ContentPlaceHolder1_NotifAvisos_ClientState' => '',
        'ctl00$ContentPlaceHolder1$btnCerrarPanel' => '',
        'ctl00$ContentPlaceHolder1$toolkit' => '{"expandedNodes":[],"collapsedNodes":[],"logEntries":[],"selectedNodes":[],"checkedNodes":["0","0:0"],"scrollPosition":0}'
        ];

# ===========================================
# Mechanize configuration and initialization
# ===========================================

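# autocheck is on by default and makes post() die on an HTTP error,
# which would bypass the status check in the retry loop
my $mech = WWW::Mechanize->new( autocheck => 0 );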
$mech->show_progress(1);
$mech->max_redirect(0);
$mech->agent_alias('Windows Mozilla');
#$mech->add_header(
#    User_Agent => 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
#    );

# ==========================================
# POST request to the CENACE page
# Note: tries up to 5 times; otherwise exit 1
# ==========================================


while ($maxtry > 0) {
    $log->log(getLoggingTime() . " INFO: POST Request: " . $url);
    $mech->post($url, $params);

    if ($mech->status == 200) {
        last;
    } else {
        $maxtry--;
        # The script exits when $maxtry reaches 1, so $maxtry - 1 attempts remain
        $log->log(getLoggingTime() . " WARN: POST Request fallido. [" . ($maxtry - 1) . "] Intentos restantes");
        if ($maxtry == 1) {
            $log->fail(getLoggingTime() . " ERROR: Numero máximo de intentos alcanzados");
            exit 1;
        }
        # The built-in sleep truncates 1.5 to 1; select() allows a fractional delay
        select(undef, undef, undef, 1.5);
    }
}




print "\n\n\n\n\n Validating: AFTER DOING THE POST:\n";
# content() needs no arguments here; the original argument list was copied from the docs synopsis
print $mech->content;
if ($mech->is_html) {
    print "\nIt DOES have HTML\n";
} else {
    print "\nDo NOT have HTML\n";
}

The response I get is a RedirectError, as follows:

1|#||4|87|pageRedirect||%2fContacto.aspx%3faspxerrorpath%3d%2fSIM%2fVISTA%2fREPORTES%2fH_CapacidadTransfer.aspx|

However, the request should return HTML that contains the link to the file I need.
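
For reference, the response shown above looks like an ASP.NET AJAX partial-postback ("delta") reply, which is a sequence of `length|type|id|content|` records rather than an HTML page. The sketch below is only an illustration of how such a reply can be split into records; `parse_delta` is a hypothetical helper name, and the record layout is assumed from the response text itself.

```python
from urllib.parse import unquote


def parse_delta(body):
    """Split a delta reply into (type, id, content) tuples.

    Assumed layout of each record: length|type|id|content|
    where length is the number of characters in content.
    """
    records = []
    pos = 0
    while pos < len(body):
        # Read the three pipe-terminated header fields: length, type, id.
        header = []
        for _ in range(3):
            end = body.index("|", pos)
            header.append(body[pos:end])
            pos = end + 1
        length = int(header[0])
        content = body[pos:pos + length]
        pos += length + 1  # skip the content and its trailing pipe
        records.append((header[1], header[2], content))
    return records


# The exact response text quoted in the question.
response = ("1|#||4|87|pageRedirect||%2fContacto.aspx%3faspxerrorpath%3d"
            "%2fSIM%2fVISTA%2fREPORTES%2fH_CapacidadTransfer.aspx|")

for kind, ident, content in parse_delta(response):
    # The pageRedirect content is URL-encoded, so decode it for display.
    print(kind, repr(ident), unquote(content))
```

Decoded, the `pageRedirect` record points at `/Contacto.aspx?aspxerrorpath=/SIM/VISTA/REPORTES/H_CapacidadTransfer.aspx`. The `aspxerrorpath` parameter suggests the server hit an error while handling the POST and redirected to its error page, which would mean the request reached the application but was rejected there.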

Maybe I am missing something I am not aware of. I would really appreciate any help or solution.
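
One thing that may be worth checking: the script posts hard-coded `__VIEWSTATE` and `__EVENTVALIDATION` values, while ASP.NET pages normally issue fresh ones on every GET. The sketch below, using only the Python standard library, shows one way to read the hidden fields from a fetched page and build (without sending) a POST that reuses them. `HiddenFieldParser`, `hidden_fields` and `build_postback` are hypothetical names, the sample HTML is made up, and the `X-MicrosoftAjax` header is what ASP.NET AJAX clients send with partial postbacks; whether this site requires it is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def build_postback(url, fields, target, argument):
    """Build, but do not send, a POST that reuses the page's hidden fields."""
    data = dict(fields)
    data["__EVENTTARGET"] = target
    data["__EVENTARGUMENT"] = argument
    data["__ASYNCPOST"] = "true"
    request = Request(url, data=urlencode(data).encode("utf-8"), method="POST")
    request.add_header("Content-Type",
                       "application/x-www-form-urlencoded; charset=utf-8")
    request.add_header("X-MicrosoftAjax", "Delta=true")  # assumption, see above
    return request


# Made-up fragment standing in for the HTML returned by a GET of the page.
sample = '''
<form>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEXAMPLE" />
<input type="hidden" name="__EVENTVALIDATION" value="/wEdEXAMPLE" />
<input type="text" name="q" value="ignored" />
</form>
'''
fields = hidden_fields(sample)
request = build_postback(
    "https://www.cenace.gob.mx/SIM/VISTA/REPORTES/H_CapacidadTransfer.aspx"
    "?N=263&site=&tipoArch=C&tipoUni=SIN&tipo=Diarios",
    fields,
    "ctl00$ContentPlaceHolder1$treePrincipal",
    '{"commandName":"Check","index":"0:0"}',
)
print(sorted(fields))
print(request.get_method(), len(request.data), "bytes")
```

In a real run the GET and the POST would need to share one session so that cookies carry over, and the remaining `ClientState` fields from the Perl script would be merged into `data` in the same way.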

Thanks

0 answers:

No answers