Google BigQuery REST API with C#: copy/export a table with a nested schema to CSV

Time: 2017-04-07 11:08:52

Tags: c# rest api google-bigquery

Is there a way to export an entire table with a nested schema from Google BigQuery as CSV using the REST API?

There is an example that does this for a non-nested schema (https://cloud.google.com/bigquery/docs/exporting-data). It works for the non-nested columns in my table. Here is the code for that part:

PagedEnumerable<TableDataList, BigQueryRow> result2 = client.ListRows(datasetId, result.Reference.TableId);
StringBuilder sb = new StringBuilder();
foreach (var row in result2)
{
    sb.Append($"{row["visitorId"]}, {row["visitNumber"]}, {row["totals.hits"]}{Environment.NewLine}");
}

using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(sb.ToString())))
{
    var obj = gcsClient.UploadObject(bucketName, fileName, contentType, stream);
}

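The table in BQ has columns such as totals.hits and totals.visits. If I try to address them directly, I get an error message saying there is no such column. If I address "totals" instead, the rows in my CSV contain the object name "System.Collections.Generic.Dictionary`2[System.String,System.Object]".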

Is it possible to do something like that? In the end, I want to get my GA table out of BQ as a CSV stored somewhere else.

1 Answer:

Answer 0: (score: 0)

It is possible. In the query below, select every column you need from the schema and flatten all the columns that need flattening.

string query = $@"
#legacySQL
SELECT
  visitorId,
  visitNumber,
  visitId,
  visitStartTime,
  date,
  hits.hitNumber as hitNumber,
  hits.product.productSKU as product.productSKU
FROM 
  FLATTEN(FLATTEN({tableName},hits),hits.product)";

// Create a job for the query and enable legacy SQL.
BigQueryJob job = client.CreateQueryJob(query,
    new CreateQueryJobOptions { UseLegacySql = true });

BigQueryResults queryResult = client.GetQueryResults(job.Reference.JobId,
    new GetQueryResultsOptions());

StringBuilder sb = new StringBuilder();

// Write the column names from the result schema as the CSV header row.
// (Select below requires "using System.Linq;".)
var fields = queryResult.Schema.Fields;
sb.Append(string.Join(",", fields.Select(f => f.Name)));
sb.Append(Environment.NewLine);

// Write the data from the GA table row by row.
int count = 0;
foreach (var row in queryResult.GetRows())
{
    count++;
    if (count % 1000 == 0)
        Console.WriteLine($"item {count} finished");

    // Emit an empty field for null values so the columns stay aligned with the header.
    sb.Append(string.Join(",", row.RawRow.F.Select(cell => cell?.V?.ToString() ?? "")));
    sb.Append(Environment.NewLine);
}
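
The code above writes each value into the CSV as-is, so a value that itself contains a comma, a double quote, or a line break would break the row. A minimal sketch of the usual CSV quoting rule follows. It assumes a hypothetical helper named CsvField.Escape, which is not part of the BigQuery client library:

```csharp
// Hypothetical helper for illustration; not part of Google.Cloud.BigQuery.V2.
public static class CsvField
{
    // Wraps the value in double quotes when it contains a comma, a double quote,
    // or a line break, and doubles any embedded double quotes.
    // A null value becomes an empty field.
    public static string Escape(object value)
    {
        string text = value?.ToString() ?? "";
        bool needsQuotes = text.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes)
            return text;
        return "\"" + text.Replace("\"", "\"\"") + "\"";
    }
}
```

Inside the row loop, each cell value could then be passed through CsvField.Escape before it is appended to the StringBuilder.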