I have a GCP function that loads .CSV files from Cloud Storage, and I need it to skip anything that is not a .CSV file

Time: 2018-03-14 12:07:24

Tags: google-cloud-platform google-cloud-storage google-cloud-functions

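My function is triggered by a Cloud Storage event and loads the file into a BigQuery table. The problem is that we also receive some .zip files with the same names, and the function tries to load those too, which causes problems with the table. I need the code to process only .csv files. Here is the code I have so far: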

exports.ToBigQuery = (event, callback) => {
  const file = event.data;
  const context = event.context;

  const BigQuery = require('@google-cloud/bigquery');
  const Storage = require('@google-cloud/storage');

  const projectId = "gas-ddr";
  const datasetId = "gas_ddr";
  const bucketName = file.bucket;
  const filename = file.name;

  const dashOffset = filename.indexOf('-');
  const tableId = filename.substring(0, dashOffset);

  console.log(`Load ${filename} into ${tableId}.`);

  // Instantiates clients
  const bigquery = new BigQuery({
    projectId: projectId,
  });

  const storage = Storage({
    projectId: projectId,
  });

  const metadata = {
    allowJaggedRows: true,
    skipLeadingRows: 1,
  };

  let job;

  // Loads data from a Google Cloud Storage file into the table
  bigquery
    .dataset(datasetId)
    .table(tableId)
    .load(storage.bucket(bucketName).file(filename),metadata)
    .then(results => {
      job = results[0];
      console.log(`Job ${job.id} started.`);

      // Wait for the job to finish
      return job;
    })
    .then(metadata => {
      // Check the job's status for errors
      const errors = metadata.status.errors;
      if (errors && errors.length > 0) {
        throw errors;
      }
    })
    .then(() => {
      console.log(`Job ${job.id} completed.`);
    })
    .catch(err => {
      console.error('ERROR:', err);
    });

  callback();
};

1 answer:

Answer 0 (score: 3)

This is really just a JavaScript question. You can simply extract the extension from the filename and handle the file accordingly:

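// Returns the lowercased extension of a filename, or '' if it has none,
// so that "data.CSV" and "data.csv" are treated the same.
function getExtension(filename) {
    var parts = filename.split('.');
    return parts.length > 1 ? parts[parts.length - 1].toLowerCase() : '';
}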


if (getExtension(filename) === "csv") {
  // Loads data from a Google Cloud Storage file into the table
  bigquery
     .dataset(datasetId)
  ...
}
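
As a rough sketch of how that check could sit at the top of the function, the example below returns early for anything that is not a CSV. It uses Node's built-in `path.extname` instead of splitting on dots and compares case-insensitively, since the question mentions both `.CSV` and `.csv`. The names `isCsvFile`, `handleStorageEvent`, and `loadIntoBigQuery` are hypothetical; `loadIntoBigQuery` just stands in for the BigQuery load logic from the question.

```javascript
const path = require('path');

// Hypothetical helper: true only when the object name ends in .csv (any case).
function isCsvFile(filename) {
  return path.extname(filename).toLowerCase() === '.csv';
}

// Hypothetical wrapper around the function body from the question.
// `loadIntoBigQuery` is a placeholder for the existing load logic.
function handleStorageEvent(event, callback, loadIntoBigQuery) {
  const file = event.data;

  // Skip .zip files and anything else that is not a CSV.
  if (!isCsvFile(file.name)) {
    console.log(`Skipping ${file.name}: not a .csv file.`);
    callback();
    return false;
  }

  loadIntoBigQuery(file);
  callback();
  return true;
}
```

Returning early keeps the existing load code unindented and makes the skipped files visible in the logs, which may help when checking that the .zip uploads are no longer reaching the table.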