I'm running into problems with BigQuery when importing CSV files exported from MySQL. The files were exported with the following options:
SELECT <some columns>
FROM <my table>
INTO OUTFILE 'export-20160411.csv'
CHARACTER SET 'utf8'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '\"'
ESCAPED BY '\\';
The problem is how to escape " (double quote) for BQ. Here is a CSV sample:
field01, field02, field03, field04
"xxx \" xxx \\", \N, "xxx", "xxx"
This causes problems, and BQ reports errors like this:
BigQuery error in load operation: Error processing job
'<project>:bqjob_r7269aea2ac9eae3c_0000015c88a9049d_1': Too many
errors encountered.
Failure details:
- file-00000000: Too many values in row starting at position:
2052615.
and this:
- mediaupload-snapshot: Error detected while parsing row starting at
position: 561497. Error: Missing close double quote (") character.
So my question is: what is the best way to export the CSV so that BQ can import it without problems?
Thanks in advance.
Expected format:
field01, field02, field03, field04
"xxx "" xxx \\", \N, "xxx", "xxx"
That is, using "" inside the string instead of \". But I don't know how to export from MySQL in this way.
Answer 0 (score: 1)
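One approach is to load the raw CSV file as if it had only one column (each whole line goes into a single column) and then do the parsing on the BigQuery side.
In the example below, assume CSVtable is the table you loaded from that CSV file, as follows: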
oneField
"xxx "" xxx \\", \N, "xxx", "xxx"
So the "parsing" can look like this:
#standardSQL
WITH CSVtable AS (
SELECT '''"xxx "" xxx \\\\", \\N, "xxx", "xxx"''' AS oneField
)
SELECT
SPLIT(oneField)[OFFSET(0)] AS field01,
SPLIT(oneField)[OFFSET(1)] AS field02,
SPLIT(oneField)[OFFSET(2)] AS field03,
SPLIT(oneField)[OFFSET(3)] AS field04
FROM CSVtable
The output of this query is:
field01 field02 field03 field04
"xxx "" xxx \\" \N "xxx" "xxx"
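If you would rather fix the file before loading, as the question asks, one option is to rewrite the MySQL output between export and load. The following is only a minimal Python sketch of that idea, not something from the original thread. The function name mysql_csv_to_doubled_quotes is made up for illustration. It assumes the file was written with ENCLOSED BY '"' and ESCAPED BY '\\' as in the question, that \\ inside a quoted field stands for one literal backslash, and that anything outside quotes (such as the \N NULL marker) should be left alone. Other escape sequences are passed through unchanged.

```python
def mysql_csv_to_doubled_quotes(text: str) -> str:
    """Rewrite backslash-escaped quotes from a MySQL export as doubled quotes.

    Hypothetical helper: inside quoted fields, \\" becomes "" and \\\\ becomes
    a single backslash. Everything else is copied through as-is.
    """
    out = []
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if in_quotes and nxt == '"':
                out.append('""')       # escaped quote -> doubled quote
            elif in_quotes and nxt == "\\":
                out.append("\\")       # escaped backslash -> one backslash
            else:
                out.append(ch + nxt)   # e.g. \N outside quotes: keep verbatim
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes  # unescaped quote opens or closes a field
        out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    sample = r'"xxx \" xxx \\", \N, "xxx", "xxx"'
    print(mysql_csv_to_doubled_quotes(sample))
    # prints: "xxx "" xxx \", \N, "xxx", "xxx"
```

Because the state is tracked over the whole string, you can pass it the full file contents rather than one line at a time, which matters if a quoted field contains a line break. The \N values are not converted here, so you would still need to tell the load job how to treat them (the bq load command has a --null_marker option that may be usable for this).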