I've run into a problem importing JSON into BigQuery. We've created a service account and are using a custom .NET 4 library to handle all communication between our servers and BQ. Queries work, listing jobs works, essentially everything that extracts data works, but uploading data in JSON format does not.
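Roughly, our upload path boils down to the REST call sketched below. This is a simplification, not our actual library code: the OAuth access token is a placeholder, and the sketch assumes the newline-delimited JSON file has already been staged in Google Cloud Storage (the gs:// path is hypothetical), whereas our library streams the bytes itself.

    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    class LoadJobSketch
    {
        static void Main()
        {
            // Placeholder: a valid OAuth 2.0 bearer token for the service account.
            string accessToken = "<OAUTH2_ACCESS_TOKEN>";

            // Load-job configuration, matching the job shown below. The gs:// URI
            // is a hypothetical staging location for the NDJSON file.
            string body = @"{
              ""configuration"": {
                ""load"": {
                  ""schema"": { ""fields"": [
                    { ""name"": ""word"",        ""type"": ""STRING"",  ""mode"": ""REQUIRED"" },
                    { ""name"": ""word_count"",  ""type"": ""INTEGER"", ""mode"": ""REQUIRED"" },
                    { ""name"": ""corpus"",      ""type"": ""STRING"",  ""mode"": ""REQUIRED"" },
                    { ""name"": ""corpus_date"", ""type"": ""INTEGER"", ""mode"": ""REQUIRED"" }
                  ] },
                  ""destinationTable"": {
                    ""projectId"": ""dot-metrics"",
                    ""datasetId"": ""DotMetric_TEST"",
                    ""tableId"": ""TestTable""
                  },
                  ""sourceUris"": [""gs://some-bucket/words.json""],
                  ""sourceFormat"": ""NEWLINE_DELIMITED_JSON"",
                  ""writeDisposition"": ""WRITE_APPEND"",
                  ""allowQuotedNewlines"": true
                }
              }
            }";

            // Insert the job via the BigQuery v2 REST API.
            var request = (HttpWebRequest)WebRequest.Create(
                "https://www.googleapis.com/bigquery/v2/projects/dot-metrics/jobs");
            request.Method = "POST";
            request.ContentType = "application/json";
            request.Headers["Authorization"] = "Bearer " + accessToken;

            byte[] payload = Encoding.UTF8.GetBytes(body);
            using (Stream requestStream = request.GetRequestStream())
            {
                requestStream.Write(payload, 0, payload.Length);
            }

            // The response body is the job resource; we then poll jobs.get until
            // the job's status.state is DONE.
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }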
Here is what the submitted job returns:
{
  "kind": "bigquery#job",
  "etag": "\"WgwoVdnmFVq0E0riaWM5H0QXabs/R_b3J5b4GjwliMH_X8kjPNLVYsI\"",
  "id": "dot-metrics:job_f7eea1449bb24dffb0a0de1637f31abb",
  "selfLink": "https://www.googleapis.com/bigquery/v2/projects/dot-metrics/jobs/job_f7eea1449bb24dffb0a0de1637f31abb",
  "jobReference": {
    "projectId": "dot-metrics",
    "jobId": "job_f7eea1449bb24dffb0a0de1637f31abb"
  },
  "configuration": {
    "load": {
      "schema": {
        "fields": [
          {
            "name": "word",
            "type": "STRING",
            "mode": "REQUIRED"
          },
          {
            "name": "word_count",
            "type": "INTEGER",
            "mode": "REQUIRED"
          },
          {
            "name": "corpus",
            "type": "STRING",
            "mode": "REQUIRED"
          },
          {
            "name": "corpus_date",
            "type": "INTEGER",
            "mode": "REQUIRED"
          }
        ]
      },
      "destinationTable": {
        "projectId": "dot-metrics",
        "datasetId": "DotMetric_TEST",
        "tableId": "TestTable"
      },
      "writeDisposition": "WRITE_APPEND",
      "allowQuotedNewlines": true,
      "sourceFormat": "NEWLINE_DELIMITED_JSON"
    }
  },
  "status": {
    "state": "DONE",
    "errorResult": {
      "reason": "internalError",
      "message": "Backend error. Job aborted."
    }
  },
  "statistics": {
    "startTime": "1350998303355",
    "endTime": "1350998337446",
    "load": {
      "inputFiles": "1",
      "inputFileBytes": "7359"
    }
  }
}
The data is newline-delimited JSON, one record per line, like this:
{"Word":"blah_139","WordCount":6615,"Corpus":"Corpus_678","CorpusDate": 6088201915056}
{"Word":"blah_602","WordCount":2978,"Corpus":"Corpus_493","CorpusDate": 6088201915056}
{"Word":"blah_50","WordCount":8315,"Corpus":"Corpus_360","CorpusDate": 6088201915056}
{"Word":"blah_736","WordCount":8971,"Corpus":"Corpus_751","CorpusDate": 6088201915056}
{"Word":"blah_243","WordCount":2362,"Corpus":"Corpus_174","CorpusDate": 6088201915056}
{"Word":"blah_643","WordCount":765,"Corpus":"Corpus_315","CorpusDate": 6088201915056}
The job runs for a while (roughly ten seconds) but then dies. Please help!
Answer 0 (score: 0)
OK, it looks like you copied the shakespeare sample table and appended to it. Because the shakespeare sample was imported from Google-internal source data with an older version of BigQuery, its schema has some warts, and those warts trigger the problem you're seeing when you append to it (specifically, the corpus_date field thinks it should be an int32 field rather than int64, even though BigQuery only supports int64 for new data).
If you do a WRITE_TRUNCATE instead of an append and pass in your new schema, or import into a new table, you shouldn't hit this problem.
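For example, resubmitting the same load configuration with the write disposition flipped to truncate (the schema is elided here; it is identical to the four fields in your job above) lets your schema replace the inherited one:

    "configuration": {
      "load": {
        "schema": { ... same four fields as in the job above ... },
        "destinationTable": {
          "projectId": "dot-metrics",
          "datasetId": "DotMetric_TEST",
          "tableId": "TestTable"
        },
        "writeDisposition": "WRITE_TRUNCATE",
        "sourceFormat": "NEWLINE_DELIMITED_JSON"
      }
    }

Alternatively, keep WRITE_APPEND but point tableId at a table that doesn't exist yet, so it gets created fresh from your schema rather than inheriting the copied one.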