I don't think I'm implementing and using a readable stream correctly and efficiently?

Time: 2018-06-23 10:33:20

Tags: javascript node.js stream

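This is a program that reads data from the text file "IN.txt" and writes it in JSON format to the file "copy.json". The words on each line of the text file are separated by tabs, so each line is split on the tab character into an array.

I think implementing the readable stream this way overwrites the same data again and again, which is inefficient for large files.
I did try many different approaches, but ran into errors such as memory leaks and the _read method not being implemented.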


Here is a small snapshot of the IN.txt file:

IN.txt file

1 answer:

Answer 0 (score: 0)

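On every call to writeToFile (that is, every time a line is read), you create a readStream, copy dataArray into it, and pipe it to the write stream. This is unnecessary when a read stream is already open on the file.

Good read: https://medium.freecodecamp.org/node-js-streams-everything-you-need-to-know-c9141306be93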
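To illustrate the point, here is a minimal sketch of writing each line straight to the already-open write stream, without creating an intermediate readable stream. The helper name `writeWithBackpressure` is hypothetical and not from the original post; it assumes `out` behaves like a Node.js writable stream and `source` exposes `pause()` and `resume()`, as a readline interface does.

```javascript
// Hypothetical helper (not from the original post): writes one chunk to `out`
// and, if the writable's internal buffer is full, pauses `source` until the
// writable emits 'drain'.
function writeWithBackpressure(out, chunk, source) {
  // write() returns false once the buffered data reaches the highWaterMark.
  const canContinue = out.write(chunk);
  if (!canContinue) {
    source.pause();
    out.once('drain', () => source.resume());
  }
  return canContinue;
}
```

It might be used as `lineReader.on('line', line => writeWithBackpressure(output, line + '\n', lineReader));`. Note that pausing a readline interface does not immediately stop 'line' events, so a few already-buffered lines can still arrive after `pause()`.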

With the approach of creating a read stream per line, an IN.txt file of about 14 MB used roughly 147 MB of heap, measured with:

process.memoryUsage().heapUsed / 1024 / 1024

A more efficient solution:

const fs = require('fs');
const readLine = require('readline');

const output = fs.createWriteStream(__dirname + '/copy.json');
const dataArray = [];

// Create the readline interface over a single read stream
const lineReader = readLine.createInterface({
    input: fs.createReadStream(__dirname + '/IN.txt')
});

const fields = ['country', 'pin', 'place', 'state', 'code', 'division', 'admin', 'mandal', 'xxx', 'lat', 'long'];

// Read the text file line by line and push each parsed line onto the array
lineReader.on('line', function (line) {
    const words = line.split('\t');
    dataArray.push(getLineContent(fields, words));
});

// Once every line has been read, write the whole array and end the stream.
// There is no process.exit() here, so the write stream can flush before the process ends.
lineReader.on('close', function () {
    console.log('***Finished***');
    output.end(JSON.stringify(dataArray, null, 4));
});

// words will look like ["IN","744301", "Mus Andaman & Nicobar Islands", "01 Nicobar 638 Carnicobar" , "9.2333", "92.7833","4"]
// Build an object that maps each field name to the word at the same index
function getLineContent(fields, words) {
    const obj = {};
    for (let i = 0; i < fields.length; i++) {
        obj[fields[i]] = words[i];
    }
    return obj;
}

For an IN.txt file of about 14 MB, this used about 5-7 MB of heap (a significant improvement over the approach above).
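The solution above still holds every parsed line in dataArray until the file ends. As a possible further step, the sketch below emits the JSON array piece by piece through a Transform stream, so nothing accumulates. The names `lineToJson` and `createJsonArrayTransform` are hypothetical, the output is compact rather than indented by 4 spaces, and empty lines are not treated specially.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: turns one tab-separated line into a JSON object string.
function lineToJson(fields, line) {
  const words = line.split('\t');
  const obj = {};
  fields.forEach((field, i) => { obj[field] = words[i]; });
  return JSON.stringify(obj);
}

// Hypothetical factory: a Transform that receives one line per chunk and
// emits the pieces of a JSON array, so no full array is held in memory.
function createJsonArrayTransform(fields) {
  let first = true;
  return new Transform({
    writableObjectMode: true, // each incoming chunk is a whole line (string)
    transform(line, _encoding, callback) {
      const prefix = first ? '[\n' : ',\n';
      first = false;
      callback(null, prefix + lineToJson(fields, line));
    },
    flush(callback) {
      // Close the array; emit "[]" if no line was ever seen.
      this.push(first ? '[]' : '\n]');
      callback();
    }
  });
}
```

Assuming Node.js 12 or later, this could be wired up as `pipeline(Readable.from(lineReader), createJsonArrayTransform(fields), output, callback)`, with `pipeline` and `Readable` taken from the `stream` module; `pipeline` also forwards errors and handles backpressure between the stages.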

Further reading:

  1. Stream highWaterMark misunderstanding
  2. Pausing readline in Node.js
  3. https://www.valentinog.com/blog/memory-usage-node-js/

The following expression gives the heap usage in MB and may help you get started:

process.memoryUsage().heapUsed / 1024 / 1024
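For repeated measurements, the expression can be wrapped in a small helper. This is only a sketch; the function names `heapUsedMB` and `formatHeap` are hypothetical.

```javascript
// Hypothetical helper: current V8 heap usage in megabytes.
function heapUsedMB() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

// Hypothetical helper: a labelled, human-readable reading with two decimals.
function formatHeap(label) {
  return `${label}: ${heapUsedMB().toFixed(2)} MB`;
}
```

Calling `console.log(formatHeap('after close'))` inside the 'close' handler, for example, would print the heap usage at the point where all lines have been read. A single reading depends on when the garbage collector last ran, so figures from different runs are only roughly comparable.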