How do I load an Avro file from Cloud Storage into a BigQuery table in Google Cloud Datalab?

Asked: 2018-02-07 15:06:40

Tags: google-bigquery google-cloud-datalab

Exporting the BigQuery table to a Cloud Storage Avro file succeeds. Importing the Cloud Storage Avro file into a BigQuery table, however, fails with

%%bq load -m create -f avro -p gs_object -t bq_table_ref

and the Python API alternative fails as well:

table = bq.Table(bq_table_ref)
job_id = table.load(source=gs_object, source_format='avro')

Is loading Avro files simply not supported in google.datalab.bigquery?

1 Answer:

Answer 0 (score: 2):

According to the google.datalab.bigquery docs, the format of the data must be 'csv' or 'json', with 'csv' as the default. The 'avro' format cannot be used.
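
For reference, a load call that this API does accept looks like the following minimal sketch; the table reference and Cloud Storage object are placeholders:

import google.datalab.bigquery as bq

# Only 'csv' (the default) and 'json' are accepted source formats here.
table = bq.Table('my_dataset.my_table')  # placeholder table reference
job = table.load('gs://my-bucket/data.csv', mode='create', source_format='csv')

To load Avro files specifically, one workaround is to use the separate google.cloud.bigquery client library instead, whose load jobs do support the AVRO source format. This is a minimal sketch assuming a reasonably recent google.cloud.bigquery release is available in the Datalab environment; the bucket, object, and table names below are placeholders:

from google.cloud import bigquery

client = bigquery.Client()

# Unlike google.datalab.bigquery, LoadJobConfig accepts AVRO as a source format.
job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.AVRO

load_job = client.load_table_from_uri(
    'gs://my-bucket/data.avro',        # placeholder source object
    'my_project.my_dataset.my_table',  # placeholder destination table
    job_config=job_config,
)
load_job.result()  # blocks until the load job finishes

Since Avro files carry their own schema, no explicit schema needs to be passed to the load job.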