How do I use the BigQuery REST API from the command line?

Time: 2017-07-14 16:49:36

Tags: rest command-line google-bigquery

Trying to make a simple GET request to one of the BigQuery REST APIs produces an error like this:

curl https://www.googleapis.com/bigquery/v2/projects/$PROJECT_ID/jobs/$JOBID

Output:

{
 "error": {
  "errors": [
   {
    "domain": "global",
    "reason": "required",
    "message": "Login Required",
    "locationType": "header",
    "location": "Authorization",
  ...

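What is the correct way to call one of the REST APIs from the command line, such as the query or insert API? The API reference has a "Try this API" feature, but the examples don't translate directly into something that can be run from the command line.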

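1 answer:

Answer 0 (score: 4)

As a disclaimer: when working from the command line, the bq tool is usually sufficient, and for more complicated use cases the BigQuery client libraries let you program against BigQuery from several languages. Still, it can sometimes be useful to make explicit requests to the REST API to see how certain APIs work at a lower level.

First, make sure you have installed the Google Cloud SDK. This should include the gcloud and bq command-line tools. If you haven't already, authorize your account by running this command from a terminal: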

gcloud auth login

This prompts you to log in and then gives you an access code that you can paste into your terminal. (The exact process may change over time.)
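The request in the question failed with "Login Required" because it carried no Authorization header. As a minimal sketch of what the later scripts do inline, the header can be wrapped in small shell functions. The names BQ_BASE, auth_header and bigquery_get are made up for illustration; only curl and gcloud auth print-access-token are real commands here.

```shell
# Base URL shared by the REST requests in this answer.
BQ_BASE="https://www.googleapis.com/bigquery/v2"

# Hypothetical helper: print the Authorization header for a given access token.
auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Hypothetical helper: GET a path under BQ_BASE with the header attached.
# The token comes from the account authorized with "gcloud auth login".
bigquery_get() {
  curl -s -H "$(auth_header "$(gcloud auth print-access-token)")" "$BQ_BASE/$1"
}
```

With these defined, the original request would become something like bigquery_get "projects/$PROJECT_ID/jobs/$JOBID".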

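Now let's try a query using the BigQuery REST API by calling the jobs.query method. Modify this script to use your own project name, which you can find in the Google Cloud Console, then paste the script into a terminal: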

PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"kind\":\"bigquery#queryRequest\",\"useLegacySql\":false,\"query\":$QUERY}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries"

If it works, you should see output like this:

{
 "kind": "bigquery#queryResponse",
 "schema": {
  "fields": [
   {
    "name": "x",
    "type": "INTEGER",
    "mode": "NULLABLE"
   },
   {
    "name": "y",
    "type": "STRING",
    "mode": "NULLABLE"
   }
  ]
 },
 "jobReference": {
  "projectId": "<your project ID>",
  "jobId": "<your job ID>"
 },
 "totalRows": "1",
 "rows": [
  {
   "f": [
    {
     "v": "1"
    },
    {
     "v": "foo"
    }
   ]
  }
 ],
 "totalBytesProcessed": "0",
 "jobComplete": true,
 "cacheHit": false
}
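The hand-escaped quotes in the REQUEST string get awkward once the SQL itself contains double quotes. One possible alternative, shown only as a sketch, is to escape the query in a helper and assemble the body with printf. The function names json_escape and query_request are hypothetical, and the escaping covers only backslashes and double quotes, so it assumes a single-line query without control characters.

```shell
# Escape backslashes and double quotes so a string can sit inside a JSON
# string literal. Newlines and other control characters are NOT handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a jobs.query request body for a standard SQL query.
query_request() {
  printf '{"kind":"bigquery#queryRequest","useLegacySql":false,"query":"%s"}' \
    "$(json_escape "$1")"
}

# Prints the same body that the script above builds by hand.
query_request "SELECT 1 AS x, 'foo' AS y;"
echo
```

The output can be piped into the same curl command in place of echo "$REQUEST".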

If you haven't set up the bq command-line tool yet, you can do so by running bq init in a terminal. Once that's done, you can run the same query with it:

bq query --use_legacy_sql=False "SELECT 1 AS x, 'foo' AS y;"

You can also see the REST API requests that the bq tool makes by passing the --apilog= option:

bq --apilog= query --use_legacy_sql=False "SELECT [1, 2, 3] AS x;"

Now let's try an example that uses the jobs.insert method instead of the query API. Run this script, replacing YOUR_PROJECT_NAME with your project name:

PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"configuration\":{\"query\":{\"useLegacySql\":false,\"query\":${QUERY}}}}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs"

Unlike the query API, which returned the results immediately, this call returns a description of the job, similar to this:

{
 "kind": "bigquery#job",
 "etag": "\"<etag string>\"",
 "id": "<project name>:<job ID>",
 "selfLink": "https://www.googleapis.com/bigquery/v2/projects/<project name>/jobs/<job ID>",
 "jobReference": {
  "projectId": "<project name>",
  "jobId": "<job ID>"
 },
 "configuration": {
  "query": {
   "query": "SELECT 1 AS x, 'foo' AS y;",
   "destinationTable": {
    "projectId": "<project name>",
    "datasetId": "<anonymous dataset>",
    "tableId": "<anonymous table>"
   },
   "createDisposition": "CREATE_IF_NEEDED",
   "writeDisposition": "WRITE_TRUNCATE",
   "useLegacySql": false
  }
 },
 "status": {
  "state": "RUNNING"
 },
 "statistics": {
  "creationTime": "<timestamp millis>",
  "startTime": "<timestamp millis>"
 },
 "user_email": "<your email address>"
}

Note the status:

 "status": {
  "state": "RUNNING"
 },
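Follow-up requests need the jobId from this response. Copying it by hand works; as a rough sketch, it can also be pulled out with sed, assuming the response is pretty-printed one key per line as shown above. The helper name job_id_from_response and the sample values are made up for illustration.

```shell
# Hypothetical helper: print the first "jobId" value found in a job
# resource read from stdin. Line-oriented, so it relies on the
# pretty-printed layout shown above.
job_id_from_response() {
  sed -n 's/.*"jobId": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample input shaped like the response above, with a made-up job ID.
sample='{
 "jobReference": {
  "projectId": "my-project",
  "jobId": "job_abc123"
 }
}'
printf '%s\n' "$sample" | job_id_from_response
```

If jq is installed, jq -r .jobReference.jobId is a sturdier way to get the same value.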

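If you want to check on the job right away, you can use the jobs.get method. As before, run this from a terminal, using the job ID from the output of the previous step: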

PROJECT="YOUR_PROJECT_NAME"
JOB_ID="YOUR_JOB_ID"
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs/$JOB_ID

If the query has finished, you'll get a response that says so:

...
"status": {
 "state": "DONE"
},
...
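For a longer-running job, the jobs.get request can be repeated until the state becomes DONE. The loop below is only a sketch: job_state and wait_for_job are hypothetical names, the 2-second delay and the default of 30 attempts are arbitrary, and the sed extraction assumes the pretty-printed layout shown above. DONE means the job has finished, not that it succeeded, so the status block is still worth checking for an errorResult afterwards.

```shell
# Print the job state from a jobs.get response read on stdin.
job_state() {
  sed -n 's/.*"state": *"\([A-Z]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: poll jobs.get until the job is DONE or we give up.
# Usage: wait_for_job PROJECT JOB_ID [MAX_ATTEMPTS]
wait_for_job() {
  project="$1"; job_id="$2"; attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    state="$(curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      "https://www.googleapis.com/bigquery/v2/projects/$project/jobs/$job_id" | job_state)"
    echo "state: ${state:-unknown}"
    if [ "$state" = "DONE" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  return 1  # gave up before the job reported DONE
}
```

It would be called as wait_for_job "$PROJECT" "$JOB_ID" before fetching the results.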

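Finally, we can fetch the query results through the REST API as well, using the jobs.getQueryResults method. This reuses PROJECT and JOB_ID from the previous step: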

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries/$JOB_ID

The output looks similar to what we got from the jobs.query method above:

{
 "kind": "bigquery#getQueryResultsResponse",
 "etag": "\"<etag string>\"",
 "schema": {
  "fields": [
   {
    "name": "x",
    "type": "INTEGER",
    "mode": "NULLABLE"
   },
   {
    "name": "y",
    "type": "STRING",
    "mode": "NULLABLE"
   }
  ]
 },
 "jobReference": {
  "projectId": "<project ID>",
  "jobId": "<job ID>"
 },
 "totalRows": "1",
 "rows": [
  {
   "f": [
    {
     "v": "1"
    },
    {
     "v": "foo"
    }
   ]
  }
 ],
 "totalBytesProcessed": "0",
 "jobComplete": true,
 "cacheHit": true
}
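In both responses each row is an "f" array whose "v" values line up, in order, with schema.fields. As a rough sketch, the values can be flattened into tab-separated rows with sed and awk. The helper name row_values is made up, the column count has to be passed in, and the approach assumes the pretty-printed layout above with only scalar columns (no nested records or arrays).

```shell
# Hypothetical helper: read a query response on stdin and print one
# tab-separated line per row. $1 is the number of columns in the schema.
row_values() {
  ncols="$1"
  sed -n 's/.*"v": *//p' |        # keep what follows each "v": key
    sed 's/^"\(.*\)"$/\1/' |      # drop surrounding quotes; null stays as null
    awk -v n="$ncols" '{ printf "%s%s", $0, (NR % n == 0 ? "\n" : "\t") }'
}

# Sample input copied from the rows section of the response above.
sample='{
 "rows": [
  {
   "f": [
    {
     "v": "1"
    },
    {
     "v": "foo"
    }
   ]
  }
 ]
}'
printf '%s\n' "$sample" | row_values 2
```

For anything beyond a quick look, a JSON-aware tool such as jq or one of the client libraries is a safer choice.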