How to copy data from one table to another with nested required fields in repeatable objects

Time: 2019-01-18 18:46:40

Tags: google-bigquery

I'm trying to copy data from one table to another. The schemas are identical, except that the source table has fields marked nullable that were meant to be required. BigQuery is complaining that the fields are null. I'm 99% certain the issue is that in many entries the repeated fields are absent, which causes no issues when inserting into the table using our normal process.

The table I'm copying from used to have the exact same schema, but the required modes were accidentally lost when the table was recreated with a different partitioning scheme.

From what I can tell, there is no way to change the fields from nullable to required in an existing table. It looks to me like you must create a new table then use a select query to copy data.

I tried enabling "Allow large results" and unchecking "Flatten results", but I'm seeing the same issue. The write preference is "Append to table".

(Note: see the edit below, as I am incorrect here; it is a data issue.) I tried building a query to confirm my theory (that the records are absent, rather than present but null), but I'm struggling to write one. I can definitely see in the preview that having some of the repeated fields be null is a real use case, so I would presume that translates to the nested required fields also being null. We have a backup of the table from before it was converted to the new partitioning, and it has the same required schema as the table I'm trying to copy into. A simple select count(*) where this.nested.required.field is null in legacy SQL on the backup indicates that there are quite a few rows that fit this criterion.

SQL used to select for insert:

select * from my_table

Edit: When making the partitioning change on the table, I was also setting certain fields to null. It appears that the select query somehow created objects with all fields null rather than simply a null object. I used a conditional to set a nested object to either null or its existing value. Still investigating, but at this point I think what I'm attempting to do is normally supported, based on playing with some toy tables and queries.

1 answer:

Answer 0 (score: 0)

When copying from one table to another using SELECT AS STRUCT, run a null check like this:

IF(foo.bar is null, null, (SELECT AS STRUCT foo.bar.* REPLACE(...)))

This prevents a null nested struct from turning into a struct full of null values.
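To make the difference concrete, here is a minimal sketch in BigQuery standard SQL. The inline `src` data, the column names `foo`, `bar`, `req`, and `opt`, and the output aliases are all hypothetical, made up for illustration. It rebuilds the nested struct once without the guard and once with it, so the two results can be compared side by side.

```sql
-- Hypothetical toy data: row 1 has a populated nested struct, row 2 has a NULL one.
WITH src AS (
  SELECT 1 AS id, STRUCT(STRUCT('a' AS req, 10 AS opt) AS bar) AS foo
  UNION ALL
  SELECT 2 AS id, STRUCT(CAST(NULL AS STRUCT<req STRING, opt INT64>) AS bar) AS foo
)
SELECT
  id,
  -- Unguarded rebuild: always constructs a struct, even when foo.bar is NULL.
  (SELECT AS STRUCT foo.bar.*) AS rebuilt_unguarded,
  -- Guarded rebuild: keeps a NULL struct as NULL.
  IF(foo.bar IS NULL, NULL, (SELECT AS STRUCT foo.bar.*)) AS rebuilt_guarded
FROM src
ORDER BY id;
```

If the behavior described above holds, row 2 should show a struct of null fields in `rebuilt_unguarded` and a plain null in `rebuilt_guarded`; only the latter is safe to write into a table where `req` is required.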

To repair it with a select statement, use a conditional that checks a field that is required, like this:

IF (bar.req is null, null, bar)
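As a rough sketch of how that conditional might sit in a full repair query, the example below assumes `bar` is a top-level nullable STRUCT column whose `req` field is required in the destination schema. The table path `my_project.my_dataset.my_table` is a placeholder, not a real table.

```sql
-- Assumption: `bar` is a top-level STRUCT column and `req` is required in the destination.
-- A struct whose required field is NULL is treated as one that should have been NULL itself.
SELECT
  * REPLACE (IF(bar.req IS NULL, NULL, bar) AS bar)
FROM `my_project.my_dataset.my_table`;
```

`SELECT * REPLACE` keeps every other column unchanged, so the result schema still lines up with the destination table when the query is run with that table as the destination and the write preference set to append.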

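Of course, the real query is considerably more complicated. The good news is that the repair query should look similar to the original query that messed up the format.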