Google BigQuery: query scans the entire table instead of the given range of a partitioned table

Asked: 2018-08-30 12:27:40

Tags: google-bigquery

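I have a fact table named AccountLines with 50 million records, partitioned on Posting_Date_New. When I filter directly on the partitioning column, my query works as expected and scans only the data within the given range. However, when I join to a dimension table on Posting_Date_New and filter on FinancialYear, the query scans the entire table. How can I fix this? I need to join the fact table to the dimension table and filter on a column of the dimension table without scanning the whole fact table. Please help.

My queries are below.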

- Query complete (5.542 sec elapsed, 244.37 MB processed)

select ah.ChargeGroup, sum(Amount) Amount from SSIS_STAGING.AccountLine acc
inner join SSIS_STAGING.Dim_Times_BI_Clustering dd on dd.Posting_Date_New = acc.Posting_Date_New
inner join SSIS_STAGING.BranchHierarchy br on br.CostCenterId = acc.BookingBranchID
inner join SSIS_STAGING.Accounts_Hierarchy ah on ah.Account = acc.G_L
where acc.Posting_Date_New between '2018-04-01' and '2019-03-31' and ZoneName = 'BU-North'
group by ah.ChargeGroup

- Query complete (16.530 sec elapsed, 5.51 GB processed)

select ah.ChargeGroup, sum(Amount) Amount from SSIS_STAGING.AccountLine acc
inner join SSIS_STAGING.Dim_Times_BI_Clustering dd on dd.Posting_Date_New = acc.Posting_Date_New
inner join SSIS_STAGING.BranchHierarchy br on br.CostCenterId = acc.BookingBranchID
inner join SSIS_STAGING.Accounts_Hierarchy ah on ah.Account = acc.G_L
where dd.FinancialYear = '2018-19' and ZoneName = 'BU-North'
group by ah.ChargeGroup

1 answer:

Answer 0 (score: 0)

To prune a partitioned table, you need to filter on the partitioning timestamp column in the WHERE clause:

#standardSQL
SELECT
  t1.name,
  t2.category
FROM
  table1 t1
INNER JOIN
  table2 t2
ON t1.id_field = t2.field2
WHERE
  t1.ts = CURRENT_TIMESTAMP()

That said, BigQuery performs better with denormalized data.
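As a possible way to apply this to the second query in the question, the sketch below first looks up the date range of the financial year from the dimension table and then filters the fact table on its partitioning column with those values. It rests on several assumptions that are not stated in the question: that Posting_Date_New is a DATE column in both tables, that BigQuery scripting (DECLARE / SET) is available in your project, and that BigQuery treats the script variables as constants for partition pruning. The variable names start_date and end_date are made up for illustration.

```sql
-- Assumption: Posting_Date_New is a DATE column in both tables.
-- start_date and end_date are hypothetical variable names.
DECLARE start_date, end_date DATE;

-- Look up the financial year's date range from the small dimension table.
SET (start_date, end_date) = (
  SELECT AS STRUCT MIN(Posting_Date_New), MAX(Posting_Date_New)
  FROM SSIS_STAGING.Dim_Times_BI_Clustering
  WHERE FinancialYear = '2018-19'
);

-- Filter the fact table directly on its partitioning column so that
-- only the partitions inside the range need to be read.
SELECT ah.ChargeGroup, SUM(Amount) AS Amount
FROM SSIS_STAGING.AccountLine acc
INNER JOIN SSIS_STAGING.Dim_Times_BI_Clustering dd
  ON dd.Posting_Date_New = acc.Posting_Date_New
INNER JOIN SSIS_STAGING.BranchHierarchy br
  ON br.CostCenterId = acc.BookingBranchID
INNER JOIN SSIS_STAGING.Accounts_Hierarchy ah
  ON ah.Account = acc.G_L
WHERE acc.Posting_Date_New BETWEEN start_date AND end_date
  AND dd.FinancialYear = '2018-19'  -- kept in case the year's dates are not contiguous
  AND ZoneName = 'BU-North'
GROUP BY ah.ChargeGroup;
```

If pruning takes effect, the bytes processed should drop to roughly the figure reported for the first query in the question. It is worth checking that number after running the script rather than taking it for granted.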