I am writing a SQL query evaluator as part of a university project, and one of the sample queries looks like this:
select suppnation, custnation, sum(volume) as revenue
from ( select n1.name as suppnation, n2.name as custnation,
              lineitem.extendedprice * (1 - lineitem.discount) as volume
       from supplier, lineitem, orders, customer, nation n1, nation n2
       where supplier.suppkey = lineitem.suppkey
         and orders.orderkey = lineitem.orderkey
         and customer.custkey = orders.custkey
         and supplier.nationkey = n1.nationkey
         and customer.nationkey = n2.nationkey
         and ( ((n1.name = 'FRANCE') and (n2.name = 'GERMANY'))
            or ((n1.name = 'GERMANY') and (n2.name = 'FRANCE')) )
         and lineitem.shipdate >= date('1995-01-01')
         and lineitem.shipdate <= date('1996-12-31') ) as shipping
group by suppnation, custnation
order by suppnation, custnation;
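Each table in the from clause is stored as a text file: every line is one record, and the field values within a record are separated by "|". Each text file is about 2 GB, and you can assume that size for every file. I want to build indexes that reduce the time spent on value lookups and range queries like the ones in the example above. How should I build these indexes so that the relevant records can be retrieved efficiently? I am using JSQLParser to parse the relevant parts of the select statement.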
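To make the question concrete, here is a minimal sketch of the kind of index I have in mind, not a finished design. It scans a "|"-delimited file once and stores, for one chosen column, a sorted map from the column value to the byte offsets of the lines that contain it. Equality predicates then become a map lookup, and range predicates become a sub-map scan followed by seeks into the file. The class name `OffsetIndex` and its methods are hypothetical names of my own. The sketch assumes ASCII data, and it assumes dates are stored as `yyyy-MM-dd` strings so that string order matches date order.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.*;

// Hypothetical helper: an in-memory sorted index over one column of a '|'-delimited file.
class OffsetIndex {
    // Column value -> byte offsets of the lines that hold that value.
    private final TreeMap<String, List<Long>> index = new TreeMap<>();
    private final File file;

    // Scans the file once and indexes the given zero-based column.
    OffsetIndex(File file, int column) throws IOException {
        this.file = file;
        try (InputStream in = new BufferedInputStream(new FileInputStream(file), 1 << 16)) {
            ByteArrayOutputStream line = new ByteArrayOutputStream();
            long lineStart = 0, pos = 0;
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == '\n') {
                    add(line, column, lineStart);
                    line.reset();
                    lineStart = pos; // the next line starts right after the newline
                } else {
                    line.write(b);
                }
            }
            if (line.size() > 0) add(line, column, lineStart); // last line without a newline
        }
    }

    private void add(ByteArrayOutputStream line, int column, long offset) {
        String text = new String(line.toByteArray(), StandardCharsets.US_ASCII); // assumption: ASCII data
        if (text.endsWith("\r")) text = text.substring(0, text.length() - 1);
        if (text.isEmpty()) return;
        String[] fields = text.split("\\|", -1);
        if (column < fields.length) {
            index.computeIfAbsent(fields[column], k -> new ArrayList<>()).add(offset);
        }
    }

    // Equality lookup, e.g. column = 'FRANCE'.
    List<String> lookup(String key) throws IOException {
        return fetch(index.getOrDefault(key, Collections.emptyList()));
    }

    // Inclusive range lookup, e.g. lo <= column <= hi.
    List<String> range(String lo, String hi) throws IOException {
        List<Long> offsets = new ArrayList<>();
        for (List<Long> o : index.subMap(lo, true, hi, true).values()) offsets.addAll(o);
        Collections.sort(offsets); // read in file order to limit back-and-forth seeking
        return fetch(offsets);
    }

    // Seeks to each offset and reads one raw record line.
    private List<String> fetch(List<Long> offsets) throws IOException {
        List<String> rows = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            for (long off : offsets) {
                raf.seek(off);
                rows.add(raf.readLine());
            }
        }
        return rows;
    }
}
```

With a hypothetical file `lineitem.tbl` whose shipdate is at some column position `c`, the date predicate in the query would map to `new OffsetIndex(new File("lineitem.tbl"), c).range("1995-01-01", "1996-12-31")`. My concern with this approach is memory: boxed offsets for every row of a 2 GB file are expensive, so for high-cardinality columns such as the join keys a primitive array or an on-disk structure would probably be needed instead.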