BigQuery JSON import internal error?

Asked: 2012-10-23 13:21:54

Tags: c#-4.0 google-bigquery

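I'm having trouble importing JSON into BigQuery. We have created a service account and are using a custom .NET 4 library to handle all communication between our servers and BQ. Queries work, listing jobs works, and essentially all data extraction works, but uploading in JSON format does not.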

Here is what the started job returns:

{
 "kind": "bigquery#job",
 "etag": "\"WgwoVdnmFVq0E0riaWM5H0QXabs/R_b3J5b4GjwliMH_X8kjPNLVYsI\"",
 "id": "dot-metrics:job_f7eea1449bb24dffb0a0de1637f31abb",
 "selfLink": "https://www.googleapis.com/bigquery/v2/projects/dot-metrics/jobs/job_f7eea1449bb24dffb0a0de1637f31abb",
 "jobReference": {
  "projectId": "dot-metrics",
  "jobId": "job_f7eea1449bb24dffb0a0de1637f31abb"
 },
 "configuration": {
  "load": {
   "schema": {
    "fields": [
     {
      "name": "word",
      "type": "STRING",
      "mode": "REQUIRED"
     },
     {
      "name": "word_count",
      "type": "INTEGER",
      "mode": "REQUIRED"
     },
     {
      "name": "corpus",
      "type": "STRING",
      "mode": "REQUIRED"
     },
     {
      "name": "corpus_date",
      "type": "INTEGER",
      "mode": "REQUIRED"
     }
    ]
   },
   "destinationTable": {
    "projectId": "dot-metrics",
    "datasetId": "DotMetric_TEST",
    "tableId": "TestTable"
   },
   "writeDisposition": "WRITE_APPEND",
   "allowQuotedNewlines": true,
   "sourceFormat": "NEWLINE_DELIMITED_JSON"
  }
 },
 "status": {
  "state": "DONE",
  "errorResult": {
   "reason": "internalError",
   "message": "Backend error. Job aborted."
  }
 },
 "statistics": {
  "startTime": "1350998303355",
  "endTime": "1350998337446",
  "load": {
   "inputFiles": "1",
   "inputFileBytes": "7359"
  }
 }
}

The data is a newline-delimited JSON string that looks like this:

{"Word":"blah_139","WordCount":6615,"Corpus":"Corpus_678","CorpusDate": 6088201915056}
{"Word":"blah_602","WordCount":2978,"Corpus":"Corpus_493","CorpusDate": 6088201915056}
{"Word":"blah_50","WordCount":8315,"Corpus":"Corpus_360","CorpusDate": 6088201915056}
{"Word":"blah_736","WordCount":8971,"Corpus":"Corpus_751","CorpusDate": 6088201915056}
{"Word":"blah_243","WordCount":2362,"Corpus":"Corpus_174","CorpusDate": 6088201915056}
{"Word":"blah_643","WordCount":765,"Corpus":"Corpus_315","CorpusDate": 6088201915056}

The job runs for a while (about 10 seconds) and then dies. Please help!

1 answer:

Answer 0: (score: 0)

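OK, it looks like you copied the Shakespeare sample table and are appending to it. Because the Shakespeare sample was imported from Google-internal source data using an older version of BigQuery, its schema has a few warts. Those warts cause your problem when we run the import: specifically, we think the corpus_date field should be an int32 field rather than an int64, even though BigQuery only supports int64 for new data.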

If you use WRITE_TRUNCATE instead of WRITE_APPEND and pass a new schema, or import into a new table, you should not hit this problem.
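
As a rough illustration of that workaround, here is a minimal Python sketch that builds the `configuration` section of a load job with `writeDisposition` set to `WRITE_TRUNCATE`. It reuses the schema and table identifiers from the job resource in the question. The helper name `build_load_config` is hypothetical, and the sketch only builds and prints the request body; how it is sent depends on your own client library.

```python
import json

# Schema copied from the job resource shown in the question.
SCHEMA_FIELDS = [
    {"name": "word", "type": "STRING", "mode": "REQUIRED"},
    {"name": "word_count", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "corpus", "type": "STRING", "mode": "REQUIRED"},
    {"name": "corpus_date", "type": "INTEGER", "mode": "REQUIRED"},
]


def build_load_config(project_id, dataset_id, table_id, truncate=True):
    """Hypothetical helper: build the `configuration` part of a load job body."""
    return {
        "load": {
            # Passing the schema explicitly lets the table be recreated with it.
            "schema": {"fields": SCHEMA_FIELDS},
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            # WRITE_TRUNCATE replaces the table contents instead of appending.
            "writeDisposition": "WRITE_TRUNCATE" if truncate else "WRITE_APPEND",
            "sourceFormat": "NEWLINE_DELIMITED_JSON",
        }
    }


config = build_load_config("dot-metrics", "DotMetric_TEST", "TestTable")
print(json.dumps({"configuration": config}, indent=1))
```

For the other option, importing into a new table, the same helper could be called with a different `table_id` that does not exist yet.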