Transferring large JSON files from Firebase Storage to Firestore

Time: 2018-01-19 03:21:35

Tags: firebase firebase-storage google-cloud-functions google-cloud-firestore

I need help streaming large JSON files from Firebase Storage to Firestore using Firebase Functions.

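I want to transfer several large newline-delimited JSON files (11 x 700MB) to Firestore. I'm trying to load them from Firebase Storage, stream each file, and write the contents to a Firestore collection.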

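I'm getting errors on the file read (from Storage) even when I test on a very small JSON file. I have read/write access, and I can see Firestore documents being created (but only sometimes).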

I get this error in the Firebase Functions console:

Error: Deadline Exceeded at /user_code/node_modules/firebase-admin/node_modules/grpc/src/client.js:554:15

It also happens on the read from Storage, because the log message I put in the read error handler is firing.

const functions = require('firebase-functions');


const admin = require('firebase-admin');
admin.initializeApp(functions.config().firebase);
const es = require('event-stream')
const Parser = require('newline-json').Parser
const gcs = require('@google-cloud/storage')();
const path = require('path');

// [START function]
exports.generateData = functions.storage.object().onChange(event => {
  const object = event.data; // The Storage object.

  const fileBucket = object.bucket; // The Storage bucket that contains the file.
  const filePath = object.name; // File path in the bucket.
  const contentType = object.contentType; // File content type.
  const resourceState = object.resourceState; // The resourceState is 'exists' or 'not_exists' (for file/folder deletions).
  const metageneration = object.metageneration; // Number of times metadata has been generated. New objects have a value of 1.

  // Exit if this is triggered on a file that is not JSON.
  if (!contentType.endsWith('json')) {
    console.log('This is not a json file.');
    return;
  }

  // Exit if this is a move or deletion event.
  if (resourceState === 'not_exists') {
    console.log('This is a deletion event.');
    return;
  }

  // Exit if file exists but is not new and is only being triggered
  // because of a metadata change.
  if (resourceState === 'exists' && metageneration > 1) {
    console.log('This is a metadata change event.');
    return;
  }

  // Download file from bucket.
  const bucket = gcs.bucket(fileBucket);

  let buf = []

  const getStream = function () {
    let stream = bucket.file(filePath).createReadStream().on('error', () => { console.log('Read Error') }).on('end', () => { console.log('Successful Read') })
    let parser = new Parser()
    return stream.pipe(parser)
  }

  getStream()
    .pipe(es.mapSync(function (data) {
      buf.push(data)
      pump()
    }))
    .on('end', () => {
      console.log("Stream Finished")
      return true
    })
    .on('error', () => {
      console.log('Stream Error')
      return false
    })

  function pump() {
    let pos;

    while((pos = buf.length) >= 1) {
      processLine(buf.pop(0))
    }
  }

  function processLine(line) {
    admin.firestore().collection('test').add(line)
  }

});

I'm getting 'Read Error' back, so the read operation must be dying.

I'm not sure what to do at this point, but I'd appreciate any help.

1 Answer:

Answer 0: (score: 0)

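Cloud Functions has a maximum execution time of 540 seconds, so it may not be a good fit for what you need. Consider setting up a small GCE instance to run the migration.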
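For what it's worth, here is a minimal sketch of what a standalone migration script on such an instance might look like. It is not taken from the question. The names importNdjson, commitBatch and batchSize are hypothetical, and the default of 500 is only an assumption based on the batched-write limit Firestore documented at the time, so check the current limits. The idea is to read the newline-delimited JSON line by line and wait for each group of writes to finish before handling more lines, instead of firing every add() without waiting.

```javascript
const readline = require('readline');

// Hypothetical helper (not part of any Firebase API): reads newline-delimited
// JSON from the readable stream `input` and hands the parsed records to
// `commitBatch` in groups of at most `batchSize`.
// `commitBatch` is an assumed callback that returns a promise.
async function importNdjson(input, commitBatch, batchSize = 500) {
  const lines = readline.createInterface({ input, crlfDelay: Infinity });
  let batch = [];
  let total = 0;

  for await (const line of lines) {
    if (line.trim() === '') continue; // skip blank lines
    batch.push(JSON.parse(line));

    if (batch.length >= batchSize) {
      await commitBatch(batch); // wait for the write before handling more lines
      total += batch.length;
      batch = [];
    }
  }

  // Flush whatever is left after the last full group.
  if (batch.length > 0) {
    await commitBatch(batch);
    total += batch.length;
  }
  return total; // number of records passed to commitBatch
}
```

To hook this up, input could be the bucket.file(filePath).createReadStream() stream from the question, and commitBatch could build a Firestore WriteBatch with db.batch(), call set() once per record on a new document reference, and return the promise from commit(). Because each group is awaited, a failed write rejects the promise returned by importNdjson rather than being silently dropped.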