Question

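I'm trying to read large files. Currently I'm following the NodeJS documentation on how to read large files, but when I read a moderately large file (~1.1 MB, ~20k lines), my Electron app freezes for about 6 minutes before it finishes loading all the lines.

Here is my current code: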

var fileContents = document.getElementById("fileContents")
// first clear out the existing text
fileContents.innerHTML = ""
if (fs.existsSync(pathToFile)) {
    const fileLine = readline.createInterface({
        input: fs.createReadStream(pathToFile)
    })

    fileLine.on('line', (line) => {
        fileContents.innerHTML += line + "\n"
    })
} else {
    fileContents.innerHTML += fileNotFound + "\n"
    console.log('Could not find file!!')
}

The element I'm targeting is an <xmp> tag.

What approaches do people use to display large files?

Answer 1

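Streams are often useful for high performance because they let you process one line at a time without loading the whole file into memory.

In this case, however, you are loading each line and then concatenating it (+=) onto the existing string in fileContents.innerHTML. All that concatenation is probably slower than just loading the entire contents of the file as a single string. Worse, you are outputting HTML every time a line is read. So for 20k lines, you are asking the rendering engine to render HTML 20,000 times!

Instead, try reading the file as a single string and outputting the HTML just once.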

fs.readFile(pathToFile, 'utf8', (err, data) => {
  if (err) throw err;
  // Assign once so the renderer only parses the content a single time.
  fileContents.innerHTML = data;
});
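
If you would still rather keep the readline stream from the question, the same idea applies: collect the lines first and touch the DOM only once at the end. The following is a minimal sketch of that, not a tested drop-in for your app. The function name readAllLines is made up for illustration, and it assumes a Node version where readline interfaces support `for await`.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper: stream the file line by line, buffer the lines in an
// array, and return them as one string so the caller can update the DOM once.
async function readAllLines(pathToFile) {
  const rl = readline.createInterface({
    input: fs.createReadStream(pathToFile, { encoding: 'utf8' }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  const lines = [];
  for await (const line of rl) {
    lines.push(line); // cheap array push instead of string/DOM concatenation
  }
  return lines.join('\n');
}

// Usage in the renderer (assumes `fileContents` and `pathToFile` from the question):
// readAllLines(pathToFile).then((text) => { fileContents.textContent = text; });
```

Using textContent instead of innerHTML in the usage line is a choice on my part: it skips HTML parsing entirely, which should be fine if you only want to show the raw text.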

Answer 2

The problem with fs.readFile() is that you won't be able to open large files (for example, 600 MB). For very large files you need to use streams regardless.
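
One way to apply this to displaying a very large file is to stream only the part you want to show. Below is a rough sketch that reads a byte range using the `start` and `end` options of fs.createReadStream (both are inclusive). The helper name readByteRange is hypothetical, and note the caveat that a multi-byte UTF-8 character can be cut in half at either boundary.

```javascript
const fs = require('fs');

// Hypothetical helper: read only bytes [start, end] (inclusive) of a file,
// so the rest of a very large file never has to be held in memory.
function readByteRange(pathToFile, start, end) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    fs.createReadStream(pathToFile, { start, end })
      .on('data', (chunk) => chunks.push(chunk))
      .on('error', reject)
      .on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
  });
}

// Usage (assumes `fileContents` is the target element in the renderer):
// readByteRange(pathToFile, 0, 64 * 1024 - 1)
//   .then((text) => { fileContents.textContent = text; });
```

You could then load further ranges as the user scrolls, instead of rendering all 600 MB at once.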

Answer 3

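I'm writing a genomics application called AminoSee using Node and Electron. When I started trying to ingest files larger than 2 GB, I had to switch to a streaming architecture, because the program was trying to load the entire file into memory. Since I only scan through the file, that was clearly ridiculous. Here is the core of my processor, from the CLI app, located at:

Source: https://github.com/tomachinz/AminoSee/blob/master/aminosee-cli.js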

  try {
    var readStream = fs.createReadStream(filename).pipe(es.split()).pipe(es.mapSync(function(line){
      readStream.pause(); // curious to test performance of removing
      streamLineNr++;
      processLine(line); // process line here and call readStream.resume() when ready
      readStream.resume();
    })
    .on('error', function(err){
      error('While reading file: ' + filename, err.reason);
      error(err)
    })
    .on('end', function() {
      log("Stream ending");
    })
    .on('close', function() {
      log("Stream closed");
      setImmediate( () => { // after a 2 GB file give the CPU 1 cycle breather!
        calcUpdate() ;
        saveDocuments();
      });
    }));
  } catch(e) {
    error("ERROR:"  + e)
  }

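I use setImmediate a lot, because before I understood callbacks and Promises, the program would get way ahead of itself! It was definitely a good way to learn about race conditions. It still has about a million bugs, so it would make a good learning project for you.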
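
To illustrate the setImmediate idea in isolation, here is a small sketch (not code from AminoSee) that processes items in batches and yields to the event loop between batches, so pending I/O and UI events get a chance to run. The names processInBatches and handleItem are invented for this example.

```javascript
// Hypothetical helper: run handleItem over items in batches of batchSize,
// yielding to the event loop with setImmediate between batches.
// Resolves with the number of items processed.
function processInBatches(items, batchSize, handleItem) {
  return new Promise((resolve, reject) => {
    let index = 0;

    function runBatch() {
      try {
        const stop = Math.min(index + batchSize, items.length);
        for (; index < stop; index++) {
          handleItem(items[index], index);
        }
      } catch (err) {
        reject(err); // surface handler errors instead of losing them
        return;
      }

      if (index < items.length) {
        setImmediate(runBatch); // let other queued work run first
      } else {
        resolve(index);
      }
    }

    runBatch();
  });
}

// Usage: processInBatches(lines, 500, processLine).then(() => saveDocuments());
```

Because the work is wrapped in a Promise, whatever comes next only runs after the last batch, which avoids the "getting ahead of itself" problem described above.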

NodeJS + Electron - Optimizing the display of large files

3 answers: