BigQuery partition expiration of one day returns two days of partition data

Time: 2018-09-10 10:43:01

Tags: google-bigquery

I have a table with a partition expiration of 1 day:

[tim@Timothys-MBP] data-exporter $ bq ls --format=pretty data_export
+-----------------------+-------+--------+------------------------------+
|        tableId        | Type  | Labels |      Time Partitioning       |
+-----------------------+-------+--------+------------------------------+
| test .                | TABLE |        | DAY (expirationMs: 86400000) |
+-----------------------+-------+--------+------------------------------+

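I have a cron job that updates this table every day at 03:00 UTC. When I query the table, I expect to get only the most recent day's data, but the query reads both today's and yesterday's partitions.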

[tim@Timothys-MBP] data-exporter $ bq --location EU  query --use_legacy_sql=false 'SELECT COUNT(Id) FROM `proj.data_export.test`'
Waiting on bqjob_*** ... (0s) Current status: DONE
+------+
| f0_  |
+------+
| 2885 |
+------+
[tim@Timothys-MBP] data-exporter $ bq --location EU  query --use_legacy_sql=false 'SELECT COUNT(Id) FROM `proj.data_export.test` WHERE _PARTITIONTIME = TIMESTAMP("2018-09-10")'
Waiting on bqjob_*** ... (2s) Current status: DONE
+------+
| f0_  |
+------+
| 1447 |
+------+
[tim@Timothys-MBP] data-exporter $ bq --location EU  query --use_legacy_sql=false 'SELECT COUNT(Id) FROM `proj.data_export.test` WHERE _PARTITIONTIME = TIMESTAMP("2018-09-09")'
Waiting on bqjob_*** ... (0s) Current status: DONE
+------+
| f0_  |
+------+
| 1438 |
+------+
[tim@Timothys-MBP] data-exporter $ bq --location EU  query --use_legacy_sql=false 'SELECT COUNT(Id) FROM `proj.data_export.test` WHERE _PARTITIONTIME = TIMESTAMP("2018-09-08")'
Waiting on bqjob_*** ... (0s) Current status: DONE
+------+
| f0_  |
+------+
| 1434 |
+------+

What should I set the partition expiration to so that only the partition with the latest data is queried?

1 answer:

Answer 0: (score: 0)

I was able to run a query against a specific partition of the table, for example:

bq query --use_legacy_sql=false 'SELECT  COUNT(id) FROM `dataset_name.table_name` WHERE _PARTITIONTIME = TIMESTAMP("2017-12-11 00:00:00 UTC")'

Note the 00:00:00 UTC immediately after the date.
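To avoid hard-coding the date, the same partition filter can be built from the current UTC date. The following is only a minimal sketch, not a tested recipe: it assumes the table name from the question, the helper name build_partition_query and the RUN_BQ switch are hypothetical and exist only for this illustration, and it assumes the cron job has already written today's partition by the time the query runs.

```shell
#!/usr/bin/env bash
# Table name taken from the question; adjust it for your own project.
TABLE='proj.data_export.test'

# Hypothetical helper: prints a standard SQL query restricted to a single
# daily partition. $1 is the partition date as YYYY-MM-DD (UTC).
build_partition_query() {
  printf 'SELECT COUNT(Id) FROM `%s` WHERE _PARTITIONTIME = TIMESTAMP("%s")' "$TABLE" "$1"
}

# Today's date in UTC, since _PARTITIONTIME values are UTC day boundaries.
TODAY_UTC="$(date -u +%Y-%m-%d)"
QUERY="$(build_partition_query "$TODAY_UTC")"

# Show the query that would be sent.
echo "$QUERY"

# RUN_BQ is a hypothetical opt-in switch for this sketch: the query is only
# submitted when RUN_BQ=1 is set and the bq CLI is installed.
if [ "${RUN_BQ:-0}" = "1" ] && command -v bq >/dev/null 2>&1; then
  bq --location EU query --use_legacy_sql=false "$QUERY"
fi
```

Filtering on _PARTITIONTIME this way limits the query to one partition regardless of when older partitions are actually removed, so the result does not depend on the expiration setting.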