Question

I'm working on an analytics project that requires me to pull some data from a very large table in Teradata. This is the query I'm using:

select TransactionNumber
from my_table
where TransactionDate between date '2017-01-01' and date '2017-12-31'
and ItemNumber in (99276);

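Even though I filter my_table to all of 2017, the query still returns nearly 900 million rows, and it takes a little over 30 seconds to run. Because of the nature of the project, I would like it to run in 5 seconds or less, but given the size of the table I'm not even sure that is feasible. In case it helps, here is what EXPLAIN shows: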

1) First, we lock DBTables.my_table in view
DB.my_table for access.
2) Next, we do an all-AMPs RETRIEVE step from 365 partitions of
DBTables.my_table in view DB.my_table with a
condition of ("(DBTables.my_table in view
DB.my_table.TransactionDate <= DATE '2017-12-31') AND
((DBTables.my_table in view
DB.my_table.TransactionDate >= DATE '2017-01-01') AND
(DBTables.my_table in view
DB.my_table.ItemNumber = 99276 ))") into Spool
1 (group_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with no confidence to be 617,535,066 rows (
14,203,306,518 bytes). The estimated time for this step is 2
minutes and 48 seconds.
3) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 2 minutes and 48 seconds.

Admittedly, I'm not very familiar with query optimization, and since I'm not a DBA, I only have read access to the database.

Retrieving results faster from a large table in Teradata

0 answers: