Setting contentEncoding for an Azure Blob stream in node.js

Date: 2018-10-18 09:44:35

Tags: node.js azure character-encoding azure-storage-blobs

I have an ISO-8859-1 encoded text file stored in an Azure blob, and I need to process it line by line using a stream, because the file can be very large.

I made sure the contentEncoding of the uploaded blob is set correctly, and I tried various settings such as ISO-8859-1, latin1, binary, ...

I can download the file to disk like this, and the encoding of the downloaded file can be recovered locally:

const fs = require('fs');
const storage = require('azure-storage');
const blobService = storage.createBlobService();
const containerName = 'myContainer';
const file = 'in.txt';

// Open the blob as a stream
let readable = blobService.createReadStream(containerName, file, {encoding: 'latin1'});

// Write the streamed bytes to a local file
let outstream = fs.createWriteStream('./out.txt');
readable.pipe(outstream);

What I actually want to do, instead of downloading the file to disk, is work on the stream with readline and parse the file as the data comes in:

const readline = require('readline');
const storage = require('azure-storage');
const blobService = storage.createBlobService();
const containerName = 'myContainer';
const file = 'in.txt';

let readable = blobService.createReadStream(containerName, file, {encoding: 'latin1'});
// This is the call that throws the TypeError
readable.setEncoding('latin1');
let rl = readline.createInterface({input: readable});

rl.on('line', function(line) {
    console.log('line: ' + line);
});

rl.on('close', function() {
    console.log('Stream closed.');
});

The code above does not work and fails with this error:

TypeError: readable.setEncoding is not a function

When I leave out setEncoding, the stream is processed by readline, but the encoding is wrong.
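
The TypeError suggests that the object returned by createReadStream does not implement the full stream.Readable interface. One possible workaround, offered as a sketch and not verified against azure-storage, is to pipe that object through a stream.PassThrough, which does support setEncoding, and hand the PassThrough to readline. The helper name forEachLatin1Line is hypothetical, and the blob stream is replaced here by an in-memory stand-in so the example runs on its own.

```javascript
const { PassThrough } = require('stream');
const readline = require('readline');

// Hypothetical helper (not from the original post): calls onLine for every
// line of a latin1-encoded byte stream. `source` only needs to support pipe().
function forEachLatin1Line(source, onLine) {
  return new Promise((resolve, reject) => {
    const decoded = new PassThrough();
    // PassThrough is a full stream.Readable, so setEncoding is available here.
    decoded.setEncoding('latin1');

    const rl = readline.createInterface({ input: decoded, crlfDelay: Infinity });
    rl.on('line', onLine);
    rl.on('close', resolve);

    source.on('error', reject);
    source.pipe(decoded);
  });
}

// Stand-in for the blob stream: "café" and "naïve" as ISO-8859-1 bytes.
const fake = new PassThrough();
forEachLatin1Line(fake, (line) => console.log('line: ' + line))
  .then(() => console.log('Stream closed.'));
fake.end(Buffer.from('caf\xe9\nna\xefve\n', 'latin1'));
```

With the real blob, the assumption is that `blobService.createReadStream(containerName, file)` could be passed as `source` in place of `fake`. Lines are handled one at a time, so the whole file is never held in memory.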

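I also tried Azure's getBlobText (even though it is not what I want), but it fails to validate the contentMD5 and returns an error for files containing Latin characters: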

Error: Hash mismatch (integrity check failed), Expected value is jd2+sYDibe5GCw5JwpMhpg==, retrieved q/IlYBBNY6XluFzKKPq7hw==.
at BlobService._validateLengthAndMD5
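
One plausible cause of this mismatch, stated as an assumption and not confirmed anywhere in this post, is that the downloaded bytes are decoded as UTF-8 before the hash is computed. ISO-8859-1 bytes above 0x7F are not valid UTF-8, so decoding replaces them and the resulting data no longer matches the stored MD5. The sketch below only demonstrates that effect with Node's built-in crypto module; md5Base64 is a hypothetical helper.

```javascript
const crypto = require('crypto');

// Hypothetical helper: base64 MD5 digest, the format shown in the error message.
function md5Base64(data) {
  return crypto.createHash('md5').update(data).digest('base64');
}

// "café\n" as ISO-8859-1 bytes; 0xE9 on its own is not valid UTF-8.
const original = Buffer.from('caf\xe9\n', 'latin1');

// Decoding as UTF-8 turns 0xE9 into U+FFFD; re-encoding yields different bytes.
const roundTripped = Buffer.from(original.toString('utf8'), 'utf8');

console.log(md5Base64(original) === md5Base64(roundTripped)); // false
console.log(original.length, roundTripped.length);            // 5 7
```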

How can I get around this without downloading the file to disk?

0 answers:

No answers yet.