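BigQuery error while reading data, error message: Row 1 has only 1 columns, while 2 is needed

Asked: 2018-06-01 12:42:19

Tags: google-sheets google-drive-api google-bigquery

I created a JSON file that defines a table in Google BigQuery linked to a Google Sheets spreadsheet: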

{
  "autodetect": true, 
  "sourceFormat": "GOOGLE_SHEETS", 
  "sourceUris": [
    "https://docs.google.com/spreadsheets/d/1P1WH7cwVDaG6k-OQxKVXtnjBXI1NGFYvHD6IxCRFsZc"
  ],
  "maxBadRecords": 1,
  "googleSheetsOptions":
  {
    "range": "Sheet2!A1:B10",
    "skipLeadingRows": 0
  },
  "schema" : {
    "fields": [
      {"name": "col3", "type": "string"},
      {"name": "col4", "type": "string"}
    ]
  }
}

When I query it with this bq command line:

bq query --external_table_definition="Sheet2::/home/avilella/LIMS/test.json" --format=csv --use_legacy_sql=false 'SELECT * FROM Sheet2'

I get this error:

BigQuery error in query operation: Error processing job 'cegx-test-project1:bqjob_r30ad5155bcd0a174_00000163bb575bcf_1': Error while reading table: Sheet2, error message: Sheets table encountered too many
errors, giving up. Rows: 2; errors: 2. Please look into the error stream for more details.
Failure details:
- 1P1WH7cwVDaG6k-OQxKVXtnjBXI1NGFYvHD6IxCRFsZc: Error while reading
data, error message: Row 1 has only 1 columns, while 2 is needed.
- 1P1WH7cwVDaG6k-OQxKVXtnjBXI1NGFYvHD6IxCRFsZc: Error while reading
data, error message: Row 2 has only 1 columns, while 2 is needed.

Any idea what I am doing wrong?

2 answers:

Answer 0: (score: 1)

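I think the problem is that there are empty cells in the Google Sheet. After adding some strings to the spreadsheet, I was able to run the same command. Note that the BigQuery load configuration has an allowJaggedRows option that accepts rows missing trailing optional columns, but it applies only to CSV, and this document on how Google Sheets data is read states:

  Empty trailing rows and columns are omitted.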

I think the best solution in this case is to replace the empty cells with some other value, for example 'null'.
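As a rough illustration of that idea, here is a minimal Python sketch that pads each row to the expected width and swaps empty cells for a placeholder. The function name `fill_empty_cells` and the sample rows are hypothetical, and reading the values from the sheet and writing them back is deliberately left out.

```python
def fill_empty_cells(rows, width, placeholder="null"):
    """Pad every row to `width` cells and replace empty cells with `placeholder`.

    `rows` is a plain list of lists standing in for the sheet range;
    fetching it from and writing it back to Google Sheets is not shown.
    """
    filled = []
    for row in rows:
        # Add empty cells on the right so every row has `width` entries.
        padded = list(row) + [""] * (width - len(row))
        # Treat None and whitespace-only cells as empty.
        filled.append([
            placeholder if cell is None or str(cell).strip() == "" else cell
            for cell in padded
        ])
    return filled


# Hypothetical sample data: the second column is missing in the first two rows.
sample = [["a"], ["b", ""], ["c", "d"]]
print(fill_empty_cells(sample, 2))
```

With the empty cells filled, no row ends in an empty cell, so every row keeps both columns that the schema expects.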

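Answer 1: (score: 1)

Looking at Sheet2 of the test spreadsheet, I see that rows 2 and 3 have only 1 column each, so both are "bad" rows, because the table schema has 2 fields, as specified in the external table definition JSON. In addition, because maxBadRecords is set to 1, the query can succeed with at most 1 bad row, but since you have two bad rows, the query fails.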
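The reasoning above can be sketched as a small Python model. This is only a simplified illustration of the counting logic, not BigQuery's actual implementation; the function names and the sample rows are made up for the example.

```python
def trim_trailing_empty(row):
    """Drop empty cells from the end of a row, mimicking omitted trailing columns."""
    cells = list(row)
    while cells and cells[-1] in ("", None):
        cells.pop()
    return cells


def count_bad_rows(rows, num_fields):
    """Count rows that have fewer cells than the schema has fields."""
    return sum(1 for row in rows if len(trim_trailing_empty(row)) < num_fields)


def query_would_succeed(rows, num_fields, max_bad_records):
    """Assumed rule: the read fails once bad rows exceed max_bad_records."""
    return count_bad_rows(rows, num_fields) <= max_bad_records


# Hypothetical sheet contents: two rows with an empty second cell.
rows = [["x", ""], ["y", ""], ["z", "w"]]
print(count_bad_rows(rows, 2))
print(query_would_succeed(rows, 2, max_bad_records=1))
print(query_would_succeed(rows, 2, max_bad_records=2))
```

In this model, two bad rows against maxBadRecords of 1 fails, while a limit of 2 would let the read go through with those rows skipped.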