Question

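In a web page, I have to read a small part of a file, and I have to do this for many (1500 - 12000) small files, each about 1 MB in size. Once I have collected the information I need, I push it back to the server.

My problem: I use the FileReader API, garbage collection does not work, and memory consumption explodes.

The code looks like this: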

function extract_information_from_files(input_files) {

//some dummy implementation
for (var i = 0; i < input_files.length; ++i) {


    (function dummy_function(file) {

        var reader = new FileReader();

        reader.onload = function () {

            //convert to Uint8Array because used library expects this

            var array_buffer = new Uint8Array(reader.result);

            //do some fancy stuff with the library (very small subset of data is kept)

            //finish

            //function call ends, expect garbage collect to start cleaning.
            //even explicit dereferencing does not work
        };

        reader.readAsArrayBuffer(file);

    })(input_files[i]);

}

}

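Some remarks:

No, at first sight the library does not seem to keep any reference to the loaded objects. Even if you run the code shown above without using array_buffer at all, everything is kept in memory.
Behaviour differs per browser:
  Chrome (43) does not clean everything up
  Firefox (38) seems to keep a residual memory usage of about 1/3 of the total size of all files
I found very few topics on the internet discussing the same problem. What I tried:
  Is it possible to clean memory after FileReader? -> old; File.prototype.mozSlice has since been changed to .slice, but even with that the problem persists
  http://www.joelandritsch.com/posts/lessons-learned-in-javascript-11 -> the proposed solution does not work.
  https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management is not very clear to me. -> It seems that at first you do not need to dereference (objects only have to be unreachable, not unreferenced), but they also state "Limitation: objects need to be made explicitly unreachable"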

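One last weird detail (posted for completeness): when I use FileReader combined with https://gildas-lormeau.github.io/zip.js/, where I read a file just before pushing it to a zip archive, garbage collection works fine.

All these remarks seem to point to me not using FileReader the way I should, so please tell me how.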

Answer 1

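The problem may be related to the order of execution. In your for loop you start reading all files with reader.readAsArrayBuffer(file). This code runs before any onload handler is run for any reader. Depending on the browser's implementation of FileReader, this can mean that the browser loads every file in full (or simply preallocates buffers for every file) before any onload is called.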
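The ordering described above can be observed with a small sketch. It is not part of the original answer: `traceReadOrder` is a hypothetical helper, and it uses `Blob.prototype.arrayBuffer()` as a stand-in for `FileReader.readAsArrayBuffer` so that it also runs outside a browser. It only shows that every read is started before any completion callback runs. It says nothing about how much memory a given browser allocates.

```javascript
// Hypothetical helper: starts a read for every blob and records
// the order in which "start" and "loaded" events happen.
function traceReadOrder(blobs) {
    var events = [];
    var pending = blobs.map(function (blob, i) {
        events.push("start " + i);
        // arrayBuffer() is asynchronous, like FileReader.readAsArrayBuffer:
        // the callback cannot run before the surrounding loop has finished.
        return blob.arrayBuffer().then(function () {
            events.push("loaded " + i);
        });
    });
    // Resolve with the recorded events once every read has completed.
    return Promise.all(pending).then(function () {
        return events;
    });
}
```

Calling `traceReadOrder` with three blobs yields six events, and the three "start" entries always come before the first "loaded" entry.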

Try processing the files like a queue and see whether it makes a difference. Something like:

function extract_information_from_files(input_files) {
    // input_files must be a real Array (a FileList has no pop())
    var reader = new FileReader();

    function process_one() {
        var single_file = input_files.pop();
        if (single_file === undefined) {
            return;
        }

        (function dummy_function(file) {
            //var reader = new FileReader();
            reader.onload = function () {
                // do your stuff
                // process next at the end
                process_one();
            };
            reader.readAsArrayBuffer(file);
        })(single_file);
    }

    process_one();
}

extract_information_from_files(file_array_1);
// uncomment next line to process another file array in parallel
// extract_information_from_files(file_array_2);

Edit: It seems that browsers expect you to reuse a FileReader. I have edited the code to reuse a single reader and tested (in Chrome) that memory usage stays limited to the largest file you read.
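The queue idea can also be written with promises. The following is only a sketch under stated assumptions, not the answer's tested code: `makeHeadReader`, `processSequentially`, `headBytes`, and `extract` are hypothetical names, and it assumes the data you need sits in the first `headBytes` bytes of each file, so that `Blob.prototype.slice` can limit how much is read. It reuses a single FileReader when one exists and falls back to `Blob.prototype.arrayBuffer()` elsewhere (for example in Node.js).

```javascript
// Returns a function that reads the first `headBytes` bytes of a File/Blob.
// A single FileReader is created once and reused for every file.
function makeHeadReader() {
    var reader = (typeof FileReader !== "undefined") ? new FileReader() : null;
    return function readHead(file, headBytes) {
        // Assumption: the interesting data is at the start of the file.
        var part = file.slice(0, headBytes);
        if (reader === null) {
            // No FileReader available; Blob.arrayBuffer() gives the same bytes.
            return part.arrayBuffer();
        }
        return new Promise(function (resolve, reject) {
            reader.onload = function () { resolve(reader.result); };
            reader.onerror = function () { reject(reader.error); };
            reader.readAsArrayBuffer(part);
        });
    };
}

// Processes the files strictly one after another: the next read starts
// only after the previous buffer has been handed to `extract`.
// `extract` is a hypothetical callback returning the small piece to keep.
async function processSequentially(files, headBytes, extract) {
    var readHead = makeHeadReader();
    var results = [];
    for (var i = 0; i < files.length; ++i) {
        var buffer = await readHead(files[i], headBytes);
        results.push(extract(new Uint8Array(buffer), files[i]));
    }
    return results;
}
```

Unlike the `pop()`-based version, this leaves the input untouched and also accepts a FileList directly, since it only uses `length` and index access. `Blob.prototype.arrayBuffer()` and `async`/`await` are not available in the Chrome 43 and Firefox 38 versions mentioned in the question.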

FileReader: reading many files with JavaScript without memory leaks
