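PhantomJS has two really handy callbacks, onLoadStarted and onLoadFinished, which essentially let you pause execution while a page is loading. But I have been searching and cannot find an equivalent for when you click() a submit button or a hyperlink. A similar page load happens, but I guess onLoadStarted is not called for it because no explicit page.open() takes place. I am trying to find a clean way to suspend execution while this load happens.
One solution is obviously nested setTimeout calls, but I would like to avoid that: it is hacky and relies on trial and error rather than on something reliable and more robust, such as testing a condition or waiting for an event.
Is there a specific callback for this kind of page load that I missed? Or is there some general code pattern that handles this kind of thing?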
EDIT:
I still haven't figured out how to make it pause. Here is the code that does not call the onLoadStarted() function when I call click():
var loadInProgress = false;

page.onLoadStarted = function() {
    loadInProgress = true;
    console.log("load started");
};

page.onLoadFinished = function() {
    loadInProgress = false;
    console.log("load finished");
};

page.open(loginPage.url, function (status) {
    if (status !== 'success') {
        console.log('Unable to access network');
        fs.write(filePath + errorState, 1, 'w');
        phantom.exit();
    } else {
        // Fill in the login form and submit it from inside the page context.
        page.evaluate(function (loginPage, credentials) {
            console.log('inside loginPage evaluate function...\n')
            document.querySelector('input[id=' + loginPage.userId + ']').value = credentials.username;
            document.querySelector('input[id=' + loginPage.passId + ']').value = credentials.password;
            document.querySelector('input[id=' + loginPage.submitId + ']').click();
            //var aTags = document.getElementsByTagName('a')
            //aTags[1].click();
        }, loginPage, credentials);
        // Runs immediately after the click, before the new page has loaded.
        page.render(renderPath + 'postLogin.png');
        console.log('rendered post-login');
    }
});
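I double-checked that the ids are correct. page.render() does show that the information was submitted, but only when I put it inside a setTimeout(). Otherwise it renders immediately, and I only see the entered credentials, before the page redirects. Maybe I am missing something else?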
Answer 0 (score: 13)
I think the onLoadStarted and onLoadFinished functions are everything you need. Take the following script as an example:
var page = require('webpage').create();

page.onResourceReceived = function(response) {
    if (response.stage !== "end") return;
    console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + response.url);
};
page.onResourceRequested = function(requestData, networkRequest) {
    console.log('Request (#' + requestData.id + '): ' + requestData.url);
};
page.onUrlChanged = function(targetUrl) {
    console.log('New URL: ' + targetUrl);
};
page.onLoadFinished = function(status) {
    console.log('Load Finished: ' + status);
};
page.onLoadStarted = function() {
    console.log('Load Started');
};
page.onNavigationRequested = function(url, type, willNavigate, main) {
    console.log('Trying to navigate to: ' + url);
};

page.open("http://example.com", function(status){
    page.evaluate(function(){
        // click
        var e = document.createEvent('MouseEvents');
        e.initMouseEvent('click', true, true, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null);
        document.querySelector("a").dispatchEvent(e);
    });
    setTimeout(function(){
        phantom.exit();
    }, 10000);
});
It prints:
Trying to navigate to: http://example.com/
Request (#1): http://example.com/
Load Started
New URL: http://example.com/
Response (#1, stage "end"): http://example.com/
Load Finished: success
Trying to navigate to: http://www.iana.org/domains/example
Request (#2): http://www.iana.org/domains/example
Load Started
Trying to navigate to: http://www.iana.org/domains/reserved
Request (#3): http://www.iana.org/domains/reserved
Response (#2, stage "end"): http://www.iana.org/domains/example
New URL: http://www.iana.org/domains/reserved
Request (#4): http://www.iana.org/_css/2013.1/screen.css
Request (#5): http://www.iana.org/_js/2013.1/jquery.js
Request (#6): http://www.iana.org/_js/2013.1/iana.js
Response (#3, stage "end"): http://www.iana.org/domains/reserved
Response (#6, stage "end"): http://www.iana.org/_js/2013.1/iana.js
Response (#4, stage "end"): http://www.iana.org/_css/2013.1/screen.css
Response (#5, stage "end"): http://www.iana.org/_js/2013.1/jquery.js
Request (#7): http://www.iana.org/_img/2013.1/iana-logo-header.svg
Request (#8): http://www.iana.org/_img/2013.1/icann-logo.svg
Response (#8, stage "end"): http://www.iana.org/_img/2013.1/icann-logo.svg
Response (#7, stage "end"): http://www.iana.org/_img/2013.1/iana-logo-header.svg
Request (#9): http://www.iana.org/_css/2013.1/print.css
Response (#9, stage "end"): http://www.iana.org/_css/2013.1/print.css
Load Finished: success
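It shows that clicking a link emits the LoadStarted event once and the NavigationRequested event twice, because there is a redirect. The trick is to add the event handlers before performing the action: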
var page = require('webpage').create();

page.open("http://example.com", function(status){
    // Register the handlers first, so they are in place when the click navigates.
    page.onLoadFinished = function(status) {
        console.log('Load Finished: ' + status);
        page.render("test37_next_page.png");
        phantom.exit();
    };
    page.onLoadStarted = function() {
        console.log('Load Started');
    };

    page.evaluate(function(){
        var e = document.createEvent('MouseEvents');
        e.initMouseEvent('click', true, true, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null);
        document.querySelector("a").dispatchEvent(e);
    });
});
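The "handlers first, action second" pattern can be wrapped in a small helper so it is reusable for any click or form submit. The following is only a sketch: actAndWaitForLoad is a hypothetical name and not part of the PhantomJS API, and the timeout fallback (10 seconds by default) is my own assumption to keep the script from hanging when the action does not actually cause a page load.

```javascript
// Hypothetical helper, not part of PhantomJS: registers onLoadFinished,
// runs `action` (which should trigger a navigation), and calls `done`
// exactly once with the load status, or with 'timeout' if no load finishes.
function actAndWaitForLoad(page, action, done, timeoutMs) {
    var settled = false;
    var timer;

    function finish(status) {
        if (settled) { return; }   // ignore repeated load events
        settled = true;
        clearTimeout(timer);
        done(status);
    }

    // Register the handler BEFORE performing the action.
    page.onLoadFinished = finish;

    // Assumed fallback: give up after timeoutMs (default 10000 ms).
    timer = setTimeout(function () { finish('timeout'); }, timeoutMs || 10000);

    action();
}

// Possible usage inside a PhantomJS script, reusing the click from above:
// actAndWaitForLoad(page, function () {
//     page.evaluate(function () { /* dispatch the click event here */ });
// }, function (status) {
//     page.render('next_page.png');
//     phantom.exit();
// });
```

Note that the helper overwrites any onLoadFinished handler already set on the page and leaves its own (now inert) handler in place afterwards, so assign a new handler if you need one for later loads.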
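If you need to do things like this, maybe it is time to try something else, such as CasperJS. It runs on top of PhantomJS but has a much nicer API for navigating web pages.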
Answer 1 (score: 8)
Use the high-level wrapper nightmarejs. It lets you easily click and then wait.
Here is the code (from its examples section):
var Nightmare = require('nightmare');

new Nightmare()
    .goto('http://yahoo.com')
    .type('input[title="Search"]', 'github nightmare')
    .click('.searchsubmit')
    .run(function (err, nightmare) {
        if (err) return console.log(err);
        console.log('Done!');
    });
More examples and API usage can be found on GitHub.
Answer 2 (score: 0)
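Here is my code, based on some of the other answers. In my case I did not need to evaluate any other JavaScript specifically. I only needed to wait for the page to finish loading.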
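var system = require('system');

if (system.args.length === 1) {
    console.log('Try to pass some arguments when invoking this script!');
    phantom.exit(1); // exit explicitly; otherwise PhantomJS keeps running
}
else {
    var page = require('webpage').create();
    var address = system.args[1];

    page.open(address, function(status){
        // Assigned after the first load has completed, so this handler
        // fires on the next load event (for example, a redirect).
        page.onLoadFinished = function(status) {
            console.log(page.content);
            phantom.exit();
        };
    });
}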
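Because the script above assigns onLoadFinished inside the page.open callback, it only prints something if a further load happens after the first one. If you just want the content of the first completed load, one option is to register the handler before starting the load. This is a rough sketch under that assumption. loadAndCapture is a hypothetical helper name, and the `handled` flag is there because onLoadFinished can fire more than once.

```javascript
// Hypothetical helper, not part of PhantomJS: registers onLoadFinished
// first, then opens `address`, and passes the status and page content of
// the first finished load to `onContent` (content is null on failure).
function loadAndCapture(page, address, onContent) {
    var handled = false;

    page.onLoadFinished = function (status) {
        if (handled) { return; }   // ignore any later load events
        handled = true;
        onContent(status, status === 'success' ? page.content : null);
    };

    page.open(address);
}

// Possible usage inside a PhantomJS script:
// var page = require('webpage').create();
// loadAndCapture(page, system.args[1], function (status, content) {
//     console.log(content);
//     phantom.exit(status === 'success' ? 0 : 1);
// });
```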
Save the above in a file named "scrape.js" and invoke it like this:
phantomjs --ssl-protocol=any --ignore-ssl-errors=true scrape.js https://www.example.com
I added the SSL-related arguments to avoid other problems I ran into with some HTTPS sites (related to certificate loading).
Hope this helps someone!