Scala process exits but does not clean up its threads

Time: 2013-04-02 21:44:36

Tags: multithreading scala process phantomjs

I am trying to run PhantomJS from a Scala application that uses Akka actors:

val process = Process("phantomjs --ignore-ssl-errors=yes " + myrenderscript.js + args ...)
val result = process.run(processLogger, true).exitValue() match {
  case ExitCode.SUCCESS => Left(Success)
  case ExitCode.TIMEOUT => Right(TimeoutError)
  case ExitCode.OPEN_FAILED => Right(NetworkError)
  case _ => Right(UnknownError)        
}

myrenderscript.js looks like this:

var version = "1.1";

var TIMEOUT = 30000,
EXIT_SUCCESS = 0,
EXIT_TIMEOUT = 2,
EXIT_OPEN_FAILED = 3;


if (phantom.args.length < 2) {
    console.log("Usage: phantomjs render.js parentUrl output [width height]");
    phantom.exit(1);
}
var url = phantom.args[0];
var output = phantom.args[1];

var width = parseInt(phantom.args[2] || 1024);
var height = parseInt(phantom.args[3] || 1024);
var clipwidth = parseInt(phantom.args[4] || 1024);
var clipheight = parseInt(phantom.args[5] || 1024);
var zoom = parseFloat(phantom.args[6] || 1.0);
var phantom_version = phantom.version.major + "." + phantom.version.minor + "." + phantom.version.patch;
var userAgentString = "PhantomJS/" + phantom_version + " screenshot-webservice/" + version;

renderUrlToFile(url, output, width, height, clipwidth, clipheight, zoom, userAgentString, function (url, file) {
    console.log("Rendered '" + url + "' at size (" + width + "," + height + ") into '" + output + "'");
    phantom.exit(EXIT_SUCCESS);
    phantom = null;
});

setTimeout(function () {
    console.error("Timeout reached (" + TIMEOUT + "ms): " + url);
    phantom.exit(EXIT_TIMEOUT);
}, TIMEOUT);


function renderUrlToFile(url, file, width, height, clipwidth, clipheight, zoom, userAgentString, callback) {
    console.log("renderUrlToFile start: " + url)
    var page = new WebPage();
    page.viewportSize = { width: width, height: height };
    page.clipRect = { top: 0, left: 0, width: clipwidth, height: clipheight};
    page.settings.userAgent = userAgentString;
    page.zoomFactor = zoom;
    page.open(url, function (status) {
        console.log("renderUrlToFile open page: " + url)
        if (status !== "success") {
            console.log("Unable to render '" + url + "' (" + status + ")");
            page.release();
            page.close();
            page = null;
            phantom.exit(EXIT_OPEN_FAILED);
        } else {
            console.log("renderUrlToFile open page success and pre-render: " + url)
            page.render(file);
            console.log("renderUrlToFile open page post-render: " + url)
            page.release();
            page.close();
            page = null;
            callback(url, file);
        }
    });
}

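Between the point where the process is created and the point where the run completes, about 4 new threads are created.

Every time the method that creates the process is called, new threads are created and started. Once the process has finished, those threads go into the MONITOR state instead of terminating. Eventually my application ends up with more than 500 threads (I am capturing a large website and its internal links).

How can I get Scala to clean up the threads that are created when running phantomjs?

Edit:

I have changed the Scala code to do the following: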

val process = Process("phantomjs --ignore-ssl-errors=yes " + myrenderscript.js + args ...).run(processLogger, connectInput)
val result = process.exitValue() match {
  case ExitCode.SUCCESS => Left(Success)
  case ExitCode.TIMEOUT => Right(TimeoutError)
  case ExitCode.OPEN_FAILED => Right(NetworkError)
  case _ => Right(UnknownError)        
}
process.destroy()

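However, the threads still remain...

1 Answer:

Answer 0 (score: 3)

I figured out why the threads were not being cleaned up, but I do not fully understand it. So if someone posts the real answer here, I will upvote it.

The problem was that I had set connectInput to true. When I set it to false, the threads are destroyed as expected. I am not sure why.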
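As a minimal sketch of that change (not the exact code from the question): the helper below runs a command with connectInput = false and collects its output. The names RenderRunner and runAndCollect are made up for illustration, and the command is passed as a Seq[String] because the original argument list is elided in the question.

```scala
import scala.collection.mutable.ListBuffer
import scala.sys.process._

// Hypothetical helper, not part of the original application.
object RenderRunner {
  /** Runs `command`, waits for it to finish, and returns (exit code, captured output lines). */
  def runAndCollect(command: Seq[String]): (Int, List[String]) = {
    val lines = ListBuffer.empty[String]
    // stdout and stderr are delivered on separate threads, so guard the buffer.
    val logger = ProcessLogger(
      out => lines.synchronized { lines += out },
      err => lines.synchronized { lines += err }
    )
    // connectInput = false: this JVM's stdin is NOT forwarded to the child process.
    val process = Process(command).run(logger, connectInput = false)
    val exitCode = process.exitValue() // blocks until the child has exited
    (exitCode, lines.synchronized(lines.toList))
  }
}
```

With connectInput = false, scala.sys.process is not asked to copy this JVM's standard input to the child, so no stdin-forwarding thread is needed for the run. The exit code can then be matched against the ExitCode values exactly as in the question.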

When connectInput is set to true, the thread dump shows one thread blocking all the others:

Thread-3@2830 daemon, prio=5, in group 'main', status: 'RUNNING'
 blocks Thread-63@4131
 blocks Thread-60@4127
 blocks Thread-57@4125
 blocks Thread-54@4121
 blocks Thread-51@4103
 blocks Thread-48@4092
 blocks Thread-45@4072
 blocks Thread-42@4061
 blocks Thread-39@4054
 blocks Thread-36@4048
 blocks Thread-33@4038
 blocks Thread-30@4036
 blocks Thread-27@4008
 blocks Thread-24@3996
 blocks Thread-21@3975
 blocks Thread-18@3952
 blocks Thread-15@3939
 blocks Thread-12@3905
 blocks Thread-9@3885
 blocks Thread-6@3850
  at java.io.FileInputStream.readBytes(FileInputStream.java:-1)
  at java.io.FileInputStream.read(FileInputStream.java:220)
  at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
  at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
  at java.io.FilterInputStream.read(FilterInputStream.java:116)
  at java.io.FilterInputStream.read(FilterInputStream.java:90)
  at scala.sys.process.BasicIO$.loop$1(BasicIO.scala:225)
  at scala.sys.process.BasicIO$.transferFullyImpl(BasicIO.scala:233)
  at scala.sys.process.BasicIO$.transferFully(BasicIO.scala:214)
  at scala.sys.process.BasicIO$.connectToIn(BasicIO.scala:183)
  at scala.sys.process.BasicIO$$anonfun$input$1.apply(BasicIO.scala:190)
  at scala.sys.process.BasicIO$$anonfun$input$1.apply(BasicIO.scala:189)
  at scala.sys.process.ProcessBuilderImpl$Simple$$anonfun$2.apply$mcV$sp(ProcessBuilderImpl.scala:72)
  at scala.sys.process.ProcessImpl$Spawn$$anon$1.run(ProcessImpl.scala:22)

I initially thought the process logger was responsible, but that turned out not to be the case.

Can someone explain this to me?
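One possible reading of the dump, offered as an interpretation rather than a confirmed explanation: the frames run from BasicIO.connectToIn down to FileInputStream.readBytes, which is consistent with a thread that copies the JVM's standard input to the child process and sits in a blocking read that does not return when the child exits. The "blocks Thread-N" entries would then be the copier threads from later runs, waiting for the lock on the same input stream. To check whether such threads accumulate, a small diagnostic like the following can count live threads by stack frame. The object and method names are hypothetical, and the class and method strings in the usage note are taken from the dump above, so they may differ in other Scala versions.

```scala
// Hypothetical diagnostic helper; relies only on java.lang.Thread.getAllStackTraces.
object ThreadFrameCounter {
  /** Counts live threads that have at least one stack frame whose class name
    * starts with `classPrefix` and whose method name contains `methodFragment`. */
  def countThreadsIn(classPrefix: String, methodFragment: String): Int = {
    val traces = Thread.getAllStackTraces.values.iterator()
    var count = 0
    while (traces.hasNext) {
      val frames = traces.next()
      val matches = frames.exists { frame =>
        frame.getClassName.startsWith(classPrefix) &&
        frame.getMethodName.contains(methodFragment)
      }
      if (matches) count += 1
    }
    count
  }
}
```

Calling ThreadFrameCounter.countThreadsIn("scala.sys.process.BasicIO", "connectToIn") before and after a batch of renders should show whether the number of stdin-copying threads grows with each run.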