BigQuery Legacy SQL - How to insert into a table with nested fields?

Time: 2017-01-04 02:31:59

Tags: google-bigquery

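I am trying to insert records into a table that has nested and repeated fields. I know that in Standard SQL the STRUCT and ARRAY keywords can be used for nested and repeated fields, respectively.

Are there equivalents of the STRUCT and ARRAY keywords in Legacy SQL for inserting records into nested and repeated fields?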

1 Answer:

Answer 0: (score: 2)

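I am reusing the example you provided in bq command line tool - How to insert into Big query tables that has nested fields?

Try the query below. It works in Legacy SQL and uses the inline version of a JavaScript UDF for Legacy SQL.

Note: by default, BigQuery Legacy SQL flattens any result, so make sure to set a destination table, set Allow Large Results to true (or check it in the Web UI), and set Flatten Results to false (or uncheck it in the Web UI).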

SELECT Employee_id, Name, Age, Department.*, Location.* FROM JS((
  SELECT Employee_id, Name, Age, Department_id, Department_Name, Department_Code, e.Location_id AS Location_id, Country,  State,  City 
  FROM (SELECT e.Employee_id AS Employee_id, e.Name AS Name, e.Age AS Age,
      e.Department_id AS Department_id, d.Department_Name AS Department_Name, d.Department_Code AS Department_Code, e.Location_id AS Location_id
    FROM Employee e JOIN Department d ON e.Department_id = d.Department_id ) AS e
  JOIN Location l ON e.Location_id = l.Location_id
),
// input columns
Employee_id,  Name, Age,  Department_id, Department_Name, Department_Code,  Location_id,  Country,  State,  City,  
// output schema
"[
  {'name': 'Employee_id', 'type': 'INTEGER', 'mode': 'NULLABLE'},
  {'name': 'Name', 'type': 'STRING', 'mode': 'NULLABLE'},
  {'name': 'Age', 'type': 'INTEGER', 'mode': 'NULLABLE'},
  {'name': 'Department', 'type': 'RECORD', 'mode': 'NULLABLE', 'fields': [
      {'name': 'Department_id', 'type': 'STRING', 'mode': 'NULLABLE'},
      {'name': 'Department_Name', 'type': 'STRING', 'mode': 'NULLABLE'},
      {'name': 'Department_Code', 'type': 'STRING', 'mode': 'NULLABLE'}
    ]},
  {'name': 'Location', 'type': 'RECORD', 'mode': 'NULLABLE', 'fields': [
      {'name': 'Location_id', 'type': 'STRING', 'mode': 'NULLABLE'},
      {'name': 'Country', 'type': 'STRING', 'mode': 'NULLABLE'},
      {'name': 'State', 'type': 'STRING', 'mode': 'NULLABLE'},
      {'name': 'City', 'type': 'STRING', 'mode': 'NULLABLE'}
    ]}
]",
// function
"function(r, emit){
  emit({
    Employee_id: r.Employee_id, Name: r.Name, Age: r.Age,
    Department: {Department_id:r.Department_id, Department_Name:r.Department_Name, Department_Code:r.Department_Code}, 
    Location: {Location_id:r.Location_id, Country:r.Country, State:r.State, City:r.City}
  });
}"
)   
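If you run the query through the API rather than the Web UI, the three settings from the note above correspond to fields of the query job configuration. A minimal sketch, assuming you build the `configuration.query` object yourself; the helper name `buildLegacyNestedQueryConfig` and the project, dataset, and table IDs are hypothetical placeholders, not anything from the original example:

```javascript
// Builds the `configuration.query` part of a BigQuery query job request.
// projectId, datasetId and tableId are placeholders supplied by the caller.
function buildLegacyNestedQueryConfig(query, projectId, datasetId, tableId) {
  return {
    query: query,
    useLegacySql: true,        // the query above is Legacy SQL
    allowLargeResults: true,   // "Allow Large Results"; needs a destination table
    flattenResults: false,     // "Flatten Results" unchecked, keeps nested records
    destinationTable: {
      projectId: projectId,
      datasetId: datasetId,
      tableId: tableId
    }
  };
}
```

For example, `buildLegacyNestedQueryConfig(sql, 'my-project', 'my_dataset', 'employee_nested')` would return an object with nesting preserved and the result written to that (hypothetical) table.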

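Note: the query above uses the inline version of the UDF for ease of presentation and testing. The inline version is not recommended and is not officially supported, but you can easily convert it to the supported form; see User-Defined Functions in Legacy SQL for details.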
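As a rough sketch of that conversion, the same function can live in a separate UDF source and be registered with `bigquery.defineFunction`. The function name `nestEmployee` and the `typeof bigquery` guard are my own additions for illustration (the guard only lets the file load outside BigQuery for local testing); the columns and schema are copied from the inline query:

```javascript
// Row-level UDF: takes one flat joined row `r` and emits one nested row.
function nestEmployee(r, emit) {
  emit({
    Employee_id: r.Employee_id,
    Name: r.Name,
    Age: r.Age,
    Department: {
      Department_id: r.Department_id,
      Department_Name: r.Department_Name,
      Department_Code: r.Department_Code
    },
    Location: {
      Location_id: r.Location_id,
      Country: r.Country,
      State: r.State,
      City: r.City
    }
  });
}

// Input column names, in the same order as in the inline JS() call.
var inputColumns = [
  'Employee_id', 'Name', 'Age',
  'Department_id', 'Department_Name', 'Department_Code',
  'Location_id', 'Country', 'State', 'City'
];

// Output schema, same shape as the JSON string passed to JS().
var outputSchema = [
  {name: 'Employee_id', type: 'INTEGER', mode: 'NULLABLE'},
  {name: 'Name', type: 'STRING', mode: 'NULLABLE'},
  {name: 'Age', type: 'INTEGER', mode: 'NULLABLE'},
  {name: 'Department', type: 'RECORD', mode: 'NULLABLE', fields: [
    {name: 'Department_id', type: 'STRING', mode: 'NULLABLE'},
    {name: 'Department_Name', type: 'STRING', mode: 'NULLABLE'},
    {name: 'Department_Code', type: 'STRING', mode: 'NULLABLE'}
  ]},
  {name: 'Location', type: 'RECORD', mode: 'NULLABLE', fields: [
    {name: 'Location_id', type: 'STRING', mode: 'NULLABLE'},
    {name: 'Country', type: 'STRING', mode: 'NULLABLE'},
    {name: 'State', type: 'STRING', mode: 'NULLABLE'},
    {name: 'City', type: 'STRING', mode: 'NULLABLE'}
  ]}
];

// `bigquery` is provided by the BigQuery UDF runtime. The guard is an
// addition so this file can also be loaded locally without errors.
if (typeof bigquery !== 'undefined') {
  bigquery.defineFunction('nestEmployee', inputColumns, outputSchema, nestEmployee);
}
```

With the UDF registered, the outer query would call it by name instead of JS(...), that is, SELECT Employee_id, Name, Age, Department.*, Location.* FROM nestEmployee(...) with the same inner SELECT as the argument.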

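P.S. Even though the above works and helped a lot before Standard SQL was an option, you should now have a strong reason before choosing Legacy SQL: Standard SQL is more elegant and gives you much more flexibility, especially when dealing with nested and repeated fields.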