Inspecting AJAX-loaded JS objects/classes with CasperJS?

Date: 2016-06-08 08:50:47

Tags: javascript ajax phantomjs casperjs

I am using the same example as in "Checking JavaScript AJAX loaded resources with Mink/Zombie in PHP?":

test_JSload.php

<?php
if (array_key_exists("QUERY_STRING", $_SERVER)) {
  if ($_SERVER["QUERY_STRING"] == "getone") {
    echo "<!doctype html>
  <html>
  <head>
  <script src='test_JSload.php?gettwo'></script>
  </head>
  </html>
  ";
    exit;
  }

  if ($_SERVER["QUERY_STRING"] == "gettwo") {
    header('Content-Type: application/javascript');
    echo "
  function person(firstName) {
    this.firstName = firstName;
    this.changeName = function (name) {
        this.firstName = name;
    };
  }
  ";
    exit;
  }
}
?>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  <style type="text/css">
.my_btn { background-color:yellow; }
  </style>
  <script src="http://code.jquery.com/jquery-1.12.4.min.js"></script>
  <script type="text/javascript">
var thishref = window.location.href.slice(0, window.location.href.indexOf('?')+1);
var qstr = window.location.href.slice(window.location.href.indexOf('?')+1);

function OnGetdata(inbtn) {
  console.log("OnGetdata; loading ?getone via AJAX call");
  //~ $.ajax(thishref + "?getone", { // works
  var ptest = {}; // init as empty object
  console.log(" ptest pre ajax is ", ptest);

  $.ajax({url: thishref + "?getone",
    async: true, // still "Synchronous XMLHttpRequest on the main thread is deprecated", because we load a script; https://stackoverflow.com/q/24639335
    success: function(data) {
      console.log("got getone data "); //, data);
      $("#dataholder").html(data);
      ptest = new person("AHA");
      console.log(" ptest post getone is ", ptest);
    },
    error: function(xhr, ajaxOptions, thrownError) {
      console.log("getone error " + thishref + " : " + xhr.status + " / " + thrownError);
    }
  });

  ptest.changeName("Somename");
  console.log(" ptest post ajax is ", ptest);
}

ondocready = function() {
  $("#getdatabtn").click(function(){
    OnGetdata(this);
  });
}
$(document).ready(ondocready);
  </script>
</head>


<body>
  <h1>Hello World!</h1>

  <button type="button" id="getdatabtn" class="my_btn">Get Data!</button>
  <div id="dataholder"></div>
</body>
</html>

You can then run a temporary server with the PHP CLI (command line, PHP 5.4 or later), from the same directory as the .php file:

php -S localhost:8080

...and finally, you can visit the page at http://127.0.0.1:8080/test_JSload.php.

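In short, when the button on this page is clicked, the JavaScript class is loaded in two passes: the first returns HTML containing a <script> tag, and the script that tag points to is loaded in the second pass. For this action, Firefox prints the following in the console: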

OnGetdata; loading ?getone via AJAX call      test_JSload.php:13:3
 ptest pre ajax is  Object {  }               test_JSload.php:16:3
TypeError: ptest.changeName is not a function test_JSload.php:31:3
got getone data                               test_JSload.php:21:7
Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help http://xhr.spec.whatwg.org/ jquery-1.12.4.min.js:4:26272
 ptest post getone is  Object { firstName: "AHA", changeName: person/this.changeName(name) } test_JSload.php:24:7

Ultimately, I want to inspect the ptest variable or the person class from CasperJS. So far, I have written this script:

test_JSload_casper.js

// run with:
// ~/.nvm/versions/node/v4.0.0/lib/node_modules/casperjs/bin/casperjs test_JSload_casper.js
// based on http://code-epicenter.com/how-to-login-to-amazon-using-casperjs-working-example/

var casper = require('casper').create({
  pageSettings: {
    loadImages: false,//The script is much faster when this field is set to false
    loadPlugins: false,
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36'
  }
});

//First step is to open page
casper.start().thenOpen("http://127.0.0.1:8080/test_JSload.php", function() {
  console.log("website opened");
});

//Second step is to click to the button
casper.then(function(){
   this.evaluate(function(){
    document.getElementById("getdatabtn").click();
   });
});

//Wait for JS to execute?!, then inspect
casper.then(function(){
  console.log("After login...");
  console.log("AA " + JSON.stringify(person));
});

casper.run();

...but when I run this CasperJS script, all I get is:

$ ~/.nvm/versions/node/v4.0.0/lib/node_modules/casperjs/bin/casperjs test_JSload_casper.js
website opened
After login...

...and nothing else. Note that the last line, console.log("AA " + JSON.stringify(person));, is not even partially executed (that is, no "AA" is printed, and neither is any kind of error message).

So, is it possible to use CasperJS to inspect such resources (AJAX-loaded JS objects/classes, possibly loaded over multiple runs/steps), and if so, how?

2 answers:

Answer 0: (score: 1)

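The Ajax request triggered by the click may not have had enough time to take effect on the page you are scraping. Make sure you wait for it to complete, using one of the many wait* functions. If the DOM changes as a result of the Ajax request, I suggest waitForSelector.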
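
A minimal sketch of such a waiting step, assuming the script from the question. `waitForPerson` is a hypothetical helper name; `casper.waitFor`, `evaluate` and `echo` are the regular CasperJS calls. Rather than watching a selector, it polls until the page itself reports that `person` is defined:

```javascript
// Hypothetical helper: schedules a step that polls the page until the
// global `person` function exists, then echoes a confirmation.
function waitForPerson(casper) {
  casper.waitFor(function check() {
    // evaluate() runs the inner function inside the page,
    // not in the Casper script, so `person` is looked up there.
    return this.evaluate(function () {
      return typeof person === 'function';
    });
  }, function then() {
    this.echo('person is now defined in the page');
  }, function onTimeout() {
    this.echo('person never showed up', 'ERROR');
  });
}

// Intended use inside the CasperJS script, after the step that clicks the button:
//   waitForPerson(casper);
//   casper.run();
```

Polling on the page state avoids guessing a fixed delay, and it does not depend on which requests CasperJS happens to report.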

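A related problem is that the page's own JavaScript is broken. Because the Ajax request that populates ptest is asynchronous, ptest.changeName("Somename") executes before the response arrives, which causes the TypeError. You can move ptest.changeName(...) into the success callback of the Ajax request.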
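
The ordering problem can be reproduced without a browser. The sketch below is only an illustration: `fakeAjax`, `flushResponses` and `pending` are hypothetical stand-ins for `$.ajax` and the network, not jQuery APIs. It shows that right after the request is issued `ptest` is still the empty object, and that the call is safe once it lives in the success callback:

```javascript
// Stand-ins for the browser side (hypothetical names): fakeAjax queues the
// success callback, flushResponses delivers the "responses" later.
var pending = [];
function fakeAjax(options) { pending.push(options.success); }
function flushResponses() { while (pending.length) { pending.shift()(); } }

// Same constructor that ?gettwo serves in the question.
function person(firstName) {
  this.firstName = firstName;
  this.changeName = function (name) { this.firstName = name; };
}

var log = [];
var ptest = {}; // still an empty object until the response arrives

fakeAjax({
  success: function () {
    ptest = new person("AHA");
    ptest.changeName("Somename"); // moved here: ptest is a person by now
    log.push("in success: " + ptest.firstName);
  }
});

// Right after the request is issued nothing has arrived yet, so calling
// ptest.changeName(...) at this point would throw the TypeError from the question.
log.push("after ajax call: changeName is " + typeof ptest.changeName);

flushResponses(); // the response "arrives"
console.log(log.join("\n"));
```

The "after ajax call" line is logged first and reports `undefined`, which mirrors the order of the messages in the Firefox console output from the question.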

To see console messages coming from the page, you have to listen for the 'remote.message' event:

casper.on("remote.message", function(msg){
    this.echo("remote> " + msg);
});

casper.start(...)...

Answer 1: (score: 0)

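I am posting this as a partial answer, since I have at least managed to print the person class. The trick is to use casper.evaluate to run the code (i.e. console.log(person)) as if it were in the remote page (see below). However, a few things are still unclear to me (and I would gladly accept an answer that clarifies them):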

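  • The person class should only exist after the ?gettwo request has completed and the corresponding JS has been retrieved; however, casperjs only reports the call to ?getone, not the one to ?gettwo! Why?
  • If I try to use __utils__.echo('plop'); or JSON.stringify(person) in the final .then(..., script execution breaks, as if a fatal error had occurred; however, no related error is reported, even though I listen for several kinds of messages. Why?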
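
On the second point, one likely explanation, stated as an assumption rather than something verified against this setup: `__utils__` and `person` exist only inside the page, so referencing them directly in a step function is a ReferenceError in the Casper script context. Values have to be returned from `evaluate`, and since a function cannot be returned as such, its source text is returned instead. A minimal sketch, where `describePerson` and `reportPerson` are hypothetical helper names:

```javascript
// Runs inside the page (via evaluate), so it may touch page globals such as
// `person`. It must not rely on variables from the Casper script, and it
// returns only plain, serializable data.
function describePerson() {
  if (typeof person !== 'function') {
    return null;
  }
  return {
    source: person.toString(),                // a function cannot cross over, its source can
    sample: JSON.stringify(new person('AHA')) // methods are dropped by JSON.stringify
  };
}

// Runs in the Casper script; `casper` is the instance created by the script.
function reportPerson(casper) {
  var info = casper.evaluate(describePerson);
  if (info === null) {
    casper.echo('person is not defined in the page', 'ERROR');
  } else {
    casper.echo('AA ' + info.sample);
    casper.echo(info.source);
  }
  return info;
}
```

Inside the final step this would be called as `reportPerson(this);`. Note that `JSON.stringify(person)` yields `undefined` even inside the page, because a function has no JSON representation; that is why the sketch stringifies an instance instead.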

In any case, here is the modified test_JSload_casper.js file:

// run with:
// ~/.nvm/versions/node/v4.0.0/lib/node_modules/casperjs/bin/casperjs test_JSload_casper.js

var casper = require('casper').create({
  verbose: true,
  logLevel: 'debug',
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36',
  pageSettings: {
    loadImages: false,//The script is much faster when this field is set to false
    loadPlugins: false
  }
});


casper.on('remote.message', function(message) {
  this.echo('remote message caught: ' + message);
});

casper.on('resource.received', function(resource) {
  var status = resource.status;
  casper.log('Resource received ' + resource.url + ' (' + status + ')');
});

casper.on("resource.error", function(resourceError) {
  this.echo("Resource error: " + "Error code: "+resourceError.errorCode+" ErrorString: "+resourceError.errorString+" url: "+resourceError.url+" id: "+resourceError.id, "ERROR");
});

casper.on("page.error", function(msg, trace) {
  this.echo("Page Error: " + msg, "ERROR");
});

// http://docs.casperjs.org/en/latest/events-filters.html#page-initialized
casper.on("page.initialized", function(page) {
  // CasperJS doesn't provide `onResourceTimeout`, so it must be set through
  // the PhantomJS means. This is only possible when the page is initialized
  page.onResourceTimeout = function(request) {
    console.log('Response Timeout (#' + request.id + '): ' + JSON.stringify(request));
  };
});


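//First step is to open page (same as in the original script; the output below shows "website opened")
casper.start().thenOpen("http://127.0.0.1:8080/test_JSload.php", function() {
  console.log("website opened");
});

//Second step is to click to the button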
casper.then(function(){
   this.evaluate(function(){
    document.getElementById("getdatabtn").click();
   });
   //~ this.wait(2000, function() { // fires, but ?gettwo never gets listed
    //~ console.log("Done waiting");
   //~ });

  //~ this.waitForResource(/\?gettwo$/, function() { // does not ever fire: "Wait timeout of 5000ms expired, exiting."
    //~ this.echo('a gettwo has been loaded.');
  //~ });
});

//Wait for JS to execute?!, then inspect
casper.then(function(){
  console.log("After login...");

  // Code inside of this function will run
  // as if it was placed inside the target page.
  casper.evaluate(function(term) {
    //~ console.log("EEE", ptest); // Page Error: ReferenceError: Can't find variable: ptest
    console.log("EEE", person); // does dump the class function
  });

  __utils__.echo('plop'); // script BREAKS here....
  console.log("BB ");
  console.log("AA " + JSON.stringify(person));
});

casper.run();

The output of this is:

$ ~/.nvm/versions/node/v4.0.0/lib/node_modules/casperjs/bin/casperjs test_php_mink/test_JSload_casper.js 
[info] [phantom] Starting...
[info] [phantom] Running suite: 4 steps
[debug] [phantom] opening url: http://127.0.0.1:8080/test_JSload.php, HTTP GET
[debug] [phantom] Navigation requested: url=http://127.0.0.1:8080/test_JSload.php, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] Resource received http://127.0.0.1:8080/test_JSload.php (200)
[debug] [phantom] url changed to "http://127.0.0.1:8080/test_JSload.php"
[debug] [phantom] Resource received http://127.0.0.1:8080/test_JSload.php (200)
[debug] [phantom] Resource received http://code.jquery.com/jquery-1.12.4.min.js (200)
[debug] [phantom] Resource received http://code.jquery.com/jquery-1.12.4.min.js (200)
[debug] [phantom] Successfully injected Casper client-side utilities
[info] [phantom] Step anonymous 2/4 http://127.0.0.1:8080/test_JSload.php (HTTP 200)
website opened
[info] [phantom] Step anonymous 2/4: done in 312ms.
[info] [phantom] Step anonymous 3/4 http://127.0.0.1:8080/test_JSload.php (HTTP 200)
remote message caught: OnGetdata; loading ?getone via AJAX call
remote message caught:  ptest pre ajax is  [object Object]
Page Error: TypeError: undefined is not a function (evaluating 'ptest.changeName("Somename")')
[info] [phantom] Step anonymous 3/4: done in 337ms.
[debug] [phantom] Resource received http://127.0.0.1:8080/test_JSload.php?getone (200)
[debug] [phantom] Resource received http://127.0.0.1:8080/test_JSload.php?getone (200)
remote message caught: got getone data 
remote message caught:  ptest post getone is  [object Object]
[info] [phantom] Step anonymous 4/4 http://127.0.0.1:8080/test_JSload.php (HTTP 200)
After login...
remote message caught: EEE function person(firstName) {
    this.firstName = firstName;
    this.changeName = function (name) {
        this.firstName = name;
    };
  }
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "about:blank"

As the "EEE" message shows, the person class (function) is reported correctly, even though http://127.0.0.1:8080/test_JSload.php?gettwo (which defines it) is never listed as a loaded resource.