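I have three tables: a main table with profit, a table with tax_a, and a table with tax_b. profit is already accumulated by ID, year, and month, but the taxes are not. I tried the solution below, and it works, but it is very slow. How can I solve this more efficiently?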
proc sql;
create table final_table as
select t1.id, t1.year, t1.month, t1.profit,
(select sum(t2.tax_a) from work.table_tax_a t2
where ((t2.year = t1.year and t2.month <= t1.month) or (t2.year < t1.year)) and t2.id = t1.id) as tax_a,
(select sum(t3.tax_b) from work.table_tax_b t3
where ((t3.year = t1.year and t3.month <= t1.month) or (t3.year < t1.year)) and t3.id = t1.id) as tax_b
from work.main_table t1;
quit;
Answer 0 (score: 2)
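This is slow because you are running two summations for every row in main_table. If you can pull those subqueries out of the query and into temporary tables, it will run much faster.
Your inner queries are just building a running total of tax over time for each ID.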
select sum(t2.tax_a)
from work.table_tax_a t2
where ((t2.year = t1.year and t2.month <= t1.month) or (t2.year < t1.year))
and t2.id = t1.id
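The condition (t2.year < t1.year) means you are accumulating across years as well. If that is your intent, calculate the cumulative sum outside of SQL and then join the result back in.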
Assuming your tables are sorted by id year month:
data temp_a;
set table_tax_a;
by id;
retain c_tax_a;
if first.id then c_tax_a = 0;
c_tax_a = c_tax_a + tax_a;
run;
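The data step above only produces a correct running total if the rows arrive in id, year, month order. If that order is not guaranteed, a sort along these lines could run first. This is a sketch that assumes sorting the work tables in place is acceptable:

```sas
/* Sort both tax tables in place so BY-group processing sees
   each id's rows in chronological order. */
proc sort data=table_tax_a;
  by id year month;
run;

proc sort data=table_tax_b;
  by id year month;
run;
```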
Run the same data step on table_tax_b to create temp_b. Then join them in SQL:
proc sql noprint;
create table final_table2 as
select t1.id, t1.year, t1.month, t1.profit, t2.c_tax_a as tax_a, t3.c_tax_b as tax_b
from main_table as t1,
temp_a as t2,
temp_b as t3
where t1.id = t2.id
and t2.id = t3.id
and t1.month = t2.month
and t2.month = t3.month
and t1.year = t2.year
and t2.year = t3.year;
quit;
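The join above expects temp_b, which is built the same way as temp_a and has to exist before the PROC SQL step runs. A minimal sketch, assuming table_tax_b has the columns id, year, month and tax_b used in the question and is already sorted by id year month:

```sas
/* Running total of tax_b per id, mirroring the temp_a step. */
data temp_b;
  set table_tax_b;
  by id;
  retain c_tax_b;                 /* keep the total across rows */
  if first.id then c_tax_b = 0;   /* restart for each new id    */
  c_tax_b = c_tax_b + tax_b;
run;
```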
With some test data, this gives the same results as your approach. My SQL step takes 0.03 seconds, whereas yours takes 0.65 seconds.
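One way to check on your own data that both approaches agree is PROC COMPARE. The sketch below assumes final_table comes from the original query and final_table2 from the join, and that id, year, and month together identify a row:

```sas
/* PROC COMPARE with an ID statement needs both inputs
   sorted by the ID variables. */
proc sort data=final_table;
  by id year month;
run;

proc sort data=final_table2;
  by id year month;
run;

/* Match rows on id/year/month and report any differences. */
proc compare base=final_table compare=final_table2;
  id id year month;
run;
```

Note that the join keeps only rows where all three tables have the same id, year, and month, while the original query keeps every row of main_table. If the tax tables lack rows for some months, the comparison should report those rows as present only in final_table.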