How do I read an Append Blob from Azure Blob Storage into a string using the Node.js SDK?

Time: 2019-01-04 12:42:57

Tags: node.js azure azure-storage-blobs

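I am following the sample at https://github.com/Azure/azure-storage-js/blob/master/blob/samples/basic.sample.js, which covers reading a blob from Azure Blob Storage into a string with the Node.js SDK.

The blob I am trying to read is an Append Blob.

Reading the stream into a string takes a very long time, and in the end I get an HTTP 412 error.

I have also asked this question here: https://github.com/Azure/azure-storage-js/issues/51

I am using Node.js v10.14.1, and the SDK is @azure/storage-blob@10.3.0.

Here is my code: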

const {
  Aborter,
  BlobURL,
  ContainerURL,
  SharedKeyCredential,
  ServiceURL,
  StorageURL,
} = require('@azure/storage-blob');
const format = require('date-fns/format');

async function streamToString(readableStream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readableStream.on('data', (data) => {
      chunks.push(data.toString());
    });
    readableStream.on('end', () => {
      resolve(chunks.join(''));
    });
    readableStream.on('error', reject);
  });
}

async function run() {
  const accountName = 'xxxstor';
  const accountKey = 'omitted';
  const credential = new SharedKeyCredential(accountName, accountKey);
  const pipeline = StorageURL.newPipeline(credential);
  const serviceURL = new ServiceURL(
    `https://${accountName}.blob.core.windows.net`,
    pipeline
  );
  const containerName = 'request-logs';
  const containerURL = ContainerURL.fromServiceURL(serviceURL, containerName);
  const blobName = `${format(new Date(), 'YYYY-MM-DD[.txt]')}`;
  const blobURL = BlobURL.fromContainerURL(containerURL, blobName);
  console.log('Downloading blob...');
  const response = await blobURL.download(Aborter.none, 0);
  console.log('Reading response to string...');
  const body = await streamToString(response.readableStreamBody);
  console.log(body.length);
}

run().catch((err) => {
  console.error(err);
});

The error I get is:

{ Error: Unexpected status code: 412
    at new RestError (C:\projects\xxx\RequestLogViewer\node_modules\@azure\ms-rest-js\dist\msRest.node.js:1397:28)
    at C:\projects\xxx\RequestLogViewer\node_modules\@azure\ms-rest-js\dist\msRest.node.js:1849:37
    at process._tickCallback (internal/process/next_tick.js:68:7)
  code: undefined,
  statusCode: 412,
  request:
  WebResource {
    streamResponseBody: true,
    url:
      'https://xxxstor.blob.core.windows.net/request-logs/2019-01-04.txt',
    method: 'GET',
    headers: HttpHeaders { _headersMap: [Object] },
    body: undefined,
    query: undefined,
    formData: undefined,
    withCredentials: false,
    abortSignal:
      a {
        _aborted: false,
        children: [],
        abortEventListeners: [Array],
        parent: undefined,
        key: undefined,
        value: undefined },
    timeout: 0,
    onUploadProgress: undefined,
    onDownloadProgress: undefined,
    operationSpec:
      { httpMethod: 'GET',
        path: '{containerName}/{blob}',
        urlParameters: [Array],
        queryParameters: [Array],
        headerParameters: [Array],
        responses: [Object],
        isXML: true,
        serializer: [Serializer] } },
  response:
  { body: undefined,
    headers: HttpHeaders { _headersMap: [Object] },
    status: 412 },
  body: undefined }

1 answer:

Answer 0 (score: 0)

This was resolved in the GitHub issue https://github.com/Azure/azure-storage-js/issues/51; the solution is copied here from that issue.

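blobURL.download() tries to download the blob into a stream with an HTTP GET request. If the stream ends unexpectedly, for example because of a network interruption, a retry issues a new HTTP GET request that resumes reading the stream from the point where it broke off.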

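The second HTTP request carries the conditional header If-Match, set to the ETag of the blob returned by the first request, to make sure the blob has not changed by the time the retry happens. If it has changed, the service returns a 412 error (conditional header not met). This strict strategy is used to avoid data integrity problems, for example the blob being completely overwritten by someone else. However, the same strategy appears to prevent reading a log file that is constantly being appended to once a retry happens, because every append changes the blob's ETag.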
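The retry behaviour described above can be illustrated without touching Azure at all. The following is only a sketch under stated assumptions: `FakeAppendBlob` is a hypothetical in-memory stand-in (it is not part of @azure/storage-blob), and its `get` method merely mimics how a GET with an If-Match precondition behaves when the ETag has changed.

```javascript
// Hypothetical in-memory model of an append blob; not an SDK class.
class FakeAppendBlob {
  constructor() {
    this.content = '';
    this.version = 0;
  }

  // Every modification produces a new ETag.
  get etag() {
    return `"v${this.version}"`;
  }

  append(text) {
    this.content += text;
    this.version += 1;
  }

  // Mimics a GET starting at `offset` with an optional If-Match precondition.
  // Simplified: a real ranged response would use status 206 rather than 200.
  get(offset, ifMatch) {
    if (ifMatch !== undefined && ifMatch !== this.etag) {
      return { status: 412 }; // precondition failed: the blob changed
    }
    return { status: 200, etag: this.etag, body: this.content.slice(offset) };
  }
}

const blob = new FakeAppendBlob();
blob.append('line 1\n');

// Initial download: no precondition, the client remembers the ETag.
const first = blob.get(0);

// A writer appends to the log while the reader is still streaming.
blob.append('line 2\n');

// The connection drops after 3 bytes; the retry resumes with If-Match.
const retry = blob.get(3, first.etag);

console.log(first.status, retry.status); // 200 412
```

Because a busy log blob receives appends all the time, any resumed read of the live blob is likely to hit this mismatch, which is consistent with the 412 in the question.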

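While I do not consider this a bug, we need to make this scenario work for you. Please try the following solution: take a snapshot of the append blob first, then read from the snapshot blob. A snapshot is read-only, so its ETag does not change while it is being downloaded.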

const blobURL = BlobURL.fromContainerURL(containerURL, blobName);
console.log('Downloading blob...');
const snapshotResponse = await blobURL.createSnapshot(Aborter.none);
const snapshotURL = blobURL.withSnapshot(snapshotResponse.snapshot);
const response = await snapshotURL.download(Aborter.none, 0);
console.log('Reading response to string...', response.contentLength);
const body = await streamToString(response.readableStreamBody);
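A side note on the `streamToString` helper from the question: it calls `toString()` on each chunk separately, so a multi-byte UTF-8 character that happens to straddle a chunk boundary can be decoded incorrectly. A minimal sketch of a variant that collects the raw Buffers and decodes once at the end is shown below. It uses only Node.js built-ins; the `encoding` parameter and the `demo` stream are illustrative assumptions, not part of the original answer.

```javascript
const { Readable } = require('stream');

// Collects all chunks as Buffers and decodes them in a single step,
// so characters split across chunk boundaries are reassembled first.
function streamToString(readableStream, encoding = 'utf8') {
  return new Promise((resolve, reject) => {
    const chunks = [];
    readableStream.on('data', (chunk) => {
      chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
    });
    readableStream.on('end', () => {
      resolve(Buffer.concat(chunks).toString(encoding));
    });
    readableStream.on('error', reject);
  });
}

// Demo stream: "é" is two bytes in UTF-8 (0xC3 0xA9), split across chunks.
const demo = new Readable({ read() {} });
demo.push(Buffer.from([0x63, 0x61, 0x66, 0xc3])); // "caf" + first byte of "é"
demo.push(Buffer.from([0xa9]));                   // second byte of "é"
demo.push(null);                                  // end of stream

const resultPromise = streamToString(demo);
resultPromise.then((text) => console.log(text)); // café
```

For very large log blobs, holding the whole body in memory may itself be the bottleneck, in which case processing the stream chunk by chunk could be preferable to building one string.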