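I'm trying to generate an output file by joining two CSV input streams: for each record in csv 1, I want to produce an output line for every record in csv 2.
I came across Highland while browsing Stack Overflow for similar solutions, and found:
Nested stream operations in Highland.js
I've tried to adapt it to my own problem, and so far I have: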
const debug = require('debug')('csvparse');
const csv = require('fast-csv');
const fs = require('fs');
const args = process.argv;
const h = require('highland');

const typestream = h(fs.createReadStream(args[2]).pipe(csv({ headers: true, ignoreEmpty: true })));
const postcodestream = h(fs.createReadStream(args[3]).pipe(csv({ headers: true, ignoreEmpty: true })));

const pipeline = typestream.flatMap((type) => {
  debug(type);
  return postcodestream.flatMap((postcode) => {
    debug(postcode);
    return h([`${type.type}-${postcode.postcode}\n`]);
  });
});

pipeline.pipe(process.stdout);
With the following sample input, csv1:
type,
STREET,
ROAD,
csv2:
postcode,
3456
3446
1234
I'd expect the output:
STREET-3456
STREET-3446
STREET-1234
ROAD-3456
ROAD-3446
ROAD-1234
But I just get:
STREET-3456
STREET-3446
STREET-1234
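I can see from the debug statements that I get ROAD out, and then it stops.
Answer 0 (score: 0)
OK, I found my problem. Basically, I should have been using through to apply the csv parser rather than wrapping the pipe, and I should also have created the fs.createReadStream inside the initial flatMap rather than referencing it from a variable (as the stream will have finished after the initial iteration).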
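The "stream will have finished" point can be illustrated without Highland at all. The sketch below is only an analogy: plain single-pass JavaScript iterators stand in for the file streams, and `crossJoin` and `makeInner` are hypothetical names introduced for the illustration. Sharing one inner source yields results for the first outer item only, while creating a fresh source per outer item yields every combination.

```javascript
// Cross join: pair every outer item with every item from the inner source.
// `makeInner` is called once per outer item and must return an iterable.
function crossJoin(outer, makeInner) {
  const rows = [];
  for (const type of outer) {
    for (const postcode of makeInner()) {
      rows.push(`${type}-${postcode}`);
    }
  }
  return rows;
}

const types = ['STREET', 'ROAD'];
const postcodes = ['3456', '3446', '1234'];

// One shared single-pass iterator: it is used up by the first outer item,
// so the second outer item finds nothing left to read.
const shared = postcodes[Symbol.iterator]();
const sharedRows = crossJoin(types, () => shared);

// A fresh iterator per outer item: every combination is produced.
const freshRows = crossJoin(types, () => postcodes[Symbol.iterator]());

console.log(sharedRows.length); // 3
console.log(freshRows.length);  // 6
```

The shared case mirrors the symptom in the question: the STREET lines appear, ROAD is read, and nothing more is emitted.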
The code is now:
#!/usr/bin/node
const debug = require('debug')('csvparse');
const csv = require('fast-csv');
const fs = require('fs');
const args = process.argv;
const h = require('highland');

const pipeline = h(fs.createReadStream(args[2]))
  .through(csv({ headers: true, ignoreEmpty: true }))
  .flatMap((type) => {
    return h(fs.createReadStream(args[3]))
      .through(csv({ headers: true, ignoreEmpty: true }))
      .map((postcode) => {
        return `${type.type}-${postcode.postcode}\n`;
      });
  });

pipeline.pipe(process.stdout);
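One trade-off of this approach is that the second file is re-opened and re-parsed for every row of the first file. If the second file is small enough to hold in memory, a possible variant is to parse it once and reuse the buffered rows. The sketch below shows that idea using only built-in JavaScript, with inline strings in place of the two files. `parseColumn` is a hypothetical, deliberately naive helper (no quoting or escaping support) and is not a replacement for fast-csv.

```javascript
// Hypothetical helper: naive parser for a CSV with a header row.
// Returns the values of one named column; blank lines are skipped.
function parseColumn(text, column) {
  const [header, ...lines] = text
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
  const index = header.split(',').indexOf(column);
  return lines.map((line) => line.split(',')[index]);
}

// Inline stand-ins for the sample csv1 and csv2 from the question.
const csv1 = 'type,\nSTREET,\nROAD,\n';
const csv2 = 'postcode,\n3456\n3446\n1234\n';

// Parse the inner file once, then reuse the buffered rows for every type.
const bufferedPostcodes = parseColumn(csv2, 'postcode');
const output = parseColumn(csv1, 'type')
  .flatMap((type) => bufferedPostcodes.map((postcode) => `${type}-${postcode}\n`))
  .join('');

process.stdout.write(output);
```

For large second files, re-reading per row as in the answer's code keeps memory use flat, so which variant fits depends on the input sizes.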