I have a table very similar to the one below, except that there are 20 v columns and 20 p columns.
|--------------Table 1-------------|
| part_id | v1 | v2 | p1 | p2 |
| 1 | 250 | 8 | 1 | 2 |
| 2 | 1348 | 9 | 28 | 88 |
| 4094 | 580 | 230 | 207 | 726 |
| 7111 | 12 | 14 | 223 | 195 |
I need to join this table to two other tables that hold the dimension information.
|----Values----| |-------Parameters------|
| v_id | value | | p_id | description |
| 8 | 1 | | 1 | 'Weight (lbs)' |
| etc... | | etc... |
Current program:
proc sql;
  create table table2 as
  select t1.part_id
        ,t1.v1
        ,val.value        as value1        /* aliases keep the repeated column names distinct */
        ,t1.v2
        ,val1.value       as value2
        ,t1.p1
        ,par.description  as description1
        ,t1.p2
        ,par1.description as description2
  from table_1 t1
  inner join values val
    on val.v_id = t1.v1
  inner join values val1
    on val1.v_id = t1.v2
  inner join parameters par
    on par.p_id = t1.p1
  inner join parameters par1
    on par1.p_id = t1.p2;
quit;
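Is there a way to join these tables without using 40 inner joins?
Answer 0 (score: 2)
Create formats from the values/parameters datasets, then use arrays in a data step to loop over the id columns and create the descriptions you need. Here is an example for the parameters table to get you started. I haven't tested this code :).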
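/* Build a CNTLIN control data set from the parameters table */
data param_fmt;
  set parameters;
  fmtname = 'param_fmt';
  start = p_id;
  label = description;
  keep fmtname start label;
run;
proc format cntlin=param_fmt;
run;
data want;
  set have;                              /* have = the wide table (Table 1) */
  array p(*) p1-p20;                     /* 20 parameter columns, as in the question */
  array p_desc(*) $50 p_desc1-p_desc20;  /* character array to hold the descriptions */
  do i = 1 to dim(p);
    p_desc(i) = put(p(i), param_fmt.);
  end;
  drop i;
run;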
Answer 1 (score: 2)
If the dimension tables are simple, convert them to formats.
data cntlin ;
set values ;
by v_id;
retain fmtname 'VALUES';
rename v_id = start value = label;
run;
proc format cntlin=cntlin;
run;
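Then you don't even need to modify the input table. You can use it as is and just attach the formats so that the decoded values print instead of the ids. (The PARAMETERS format has to be built from the parameters table in the same way as the VALUES format above.)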
proc print data=table_1 ;
format v1-v2 values. p1-p2 parameters. ;
run;
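As a follow-on to the format idea above, here is a minimal, untested sketch of a join-free version of the original PROC SQL query: it builds one format per dimension table and decodes each id column with PUT(). The format names v_lookup and p_lookup, the control data set names, the $50 label length, and the output column names are all assumptions made for illustration; the sample rows are borrowed from the Oracle example in the last answer.

```sas
/* Sample data: a few rows borrowed from the Oracle example in the last answer. */
data table_1;
  input part_id v1 v2 p1 p2;
  datalines;
1 250 8 1 2
2 1348 9 28 88
;

data values;
  input v_id value;
  datalines;
8 5
9 6
250 1
1348 2
;

data parameters;
  infile datalines dsd dlm=',';
  length description $50;
  input p_id description;
  datalines;
1,Weight (lbs)
2,Time (secs)
28,Weight (kgs)
88,Time (mins)
;

/* Control data sets for PROC FORMAT: START = id, LABEL = decoded text. */
data cntlin_v;
  set values;
  length label $50;
  retain fmtname 'v_lookup' type 'N';   /* hypothetical format name */
  start = v_id;
  label = cats(value);                  /* store the numeric value as text */
  keep fmtname type start label;
run;

data cntlin_p;
  set parameters;
  length label $50;
  retain fmtname 'p_lookup' type 'N';   /* hypothetical format name */
  start = p_id;
  label = description;
  keep fmtname type start label;
run;

proc format cntlin=cntlin_v; run;
proc format cntlin=cntlin_p; run;

/* Same shape as the original query, but with no joins at all. */
proc sql;
  create table table2 as
  select part_id
       , v1, put(v1, v_lookup.) as value1
       , v2, put(v2, v_lookup.) as value2
       , p1, put(p1, p_lookup.) as description1
       , p2, put(p2, p_lookup.) as description2
  from table_1;
quit;
```

With 20 v and 20 p columns the select list still has 40 PUT() calls, but each is a lookup against a format rather than a join. Note that the decoded values come back as character, and an id with no matching row in the dimension table comes through as the raw id, which may or may not be what you want.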
If a dimension table has more than two columns, convert each extra column into another format.
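A sketch of what "one format per extra column" could look like, assuming (hypothetically) that the parameters table carried a separate units column alongside description. The units column, the sample rows, and the format names p_descr and p_units are invented for illustration and are not part of the question; the code is untested.

```sas
/* Hypothetical parameters table with an extra units column. */
data parameters;
  infile datalines dsd dlm=',';
  length description $50 units $10;
  input p_id description units;
  datalines;
1,Weight,lbs
28,Weight,kgs
;

/* Write two control rows per parameter: one for each format. */
data cntlin;
  set parameters;
  length fmtname $32 label $50;
  type = 'N';
  start = p_id;
  fmtname = 'p_descr'; label = description; output;
  fmtname = 'p_units'; label = units;       output;
  keep fmtname type start label;
run;

/* Keep the rows for each format together before loading them. */
proc sort data=cntlin;
  by fmtname start;
run;

proc format cntlin=cntlin;
run;

/* Each extra column is now available through its own format. */
data _null_;
  d = put(28, p_descr.);
  u = put(28, p_units.);
  put d= u=;
run;
```

The same id column can then be decoded several ways, one PUT() call per attribute, without touching the dimension table again.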
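If the dimension tables are too large for formats, try using the KEY= option on the SET statement to look up the dimension values (this requires indexes on v_id and p_id).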
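data want ;
  set table_1 ;
  array _v v1-v2 ;
  array _p p1-p2 ;
  array values (2) ;
  array descriptions (2) $50 ;
  do _n_=1 to dim(_v);
    v_id=_v(_n_) ;
    /* UNIQUE restarts the index search on every lookup */
    if not missing(v_id) then set values key=v_id / unique ;
    values(_n_) = value ;
    p_id=_p(_n_) ;
    if not missing(p_id) then set parameters key=p_id / unique ;
    descriptions(_n_) = description ;
    call missing(value, description) ;   /* do not carry a hit over to the next lookup */
  end;
  _error_ = 0 ;   /* a failed lookup sets _ERROR_; clear it to keep the log quiet */
  drop v_id p_id value description ;
run;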
You could also convert TABLE_1 to a tall layout instead of a wide one. Then you only need one join per dimension table.
data tall ;
set table_1 ;
array _v v1-v2 ;
array _p p1-p2 ;
do col=1 to dim(_v);
v_id=_v(col) ;
p_id=_p(col) ;
output;
end;
drop v1-v2 p1-p2 ;
run;
proc sql ;
  create table table_2 as
  select a.*
       , v.value
       , p.description
  from tall a
  left join values v on a.v_id = v.v_id
  left join parameters p on a.p_id = p.p_id
  ;
quit;
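Answer 2 (score: 1)
First of all, ugh, what a horrible design! I assume you're stuck with it? If so, you have my sincerest sympathy. If not, change it if you can!
Anyway, since you have a fixed number of columns of each type (20, you said), you could always unpivot the table into a single pair of value/parameter id columns, do the joins, and then pivot the result back into one row per part, like so (Oracle syntax):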
with table1 as (select 1 part_id, 250 v1, 8 v2, 1 p1, 2 p2 from dual union all
select 2 part_id, 1348 v1, 9 v2, 28 p1, 88 p2 from dual union all
select 4094 part_id, 580 v1, 230 v2, 207 p1, 726 p2 from dual union all
select 7111 part_id, 12 v1, 14 v2, 223 p1, 195 p2 from dual),
vals as (select 250 v_id, 1 value from dual union all
select 1348 v_id, 2 value from dual union all
select 580 v_id, 3 value from dual union all
select 12 v_id, 4 value from dual union all
select 8 v_id, 5 value from dual union all
select 9 v_id, 6 value from dual union all
select 230 v_id, 7 value from dual union all
select 14 v_id, 8 value from dual),
params as (select 1 p_id, 'Weight (lbs)' description from dual union all
select 28 p_id, 'Weight (kgs)' description from dual union all
select 207 p_id, 'Length (ins)' description from dual union all
select 223 p_id, 'Length (cm)' description from dual union all
select 2 p_id, 'Time (secs)' description from dual union all
select 88 p_id, 'Time (mins)' description from dual union all
select 726 p_id, 'Speed (mph)' description from dual union all
select 195 p_id, 'Speed (kmph)' description from dual),
t1 as (select part_id,
id,
v_id,
p_id
from table1
unpivot ((v_id, p_id) for id in ((v1, p1) as 1,
(v2, p2) as 2))),
res as (select t1.part_id,
t1.id,
t1.v_id,
v.value,
t1.p_id,
p.description
from t1
inner join vals v on t1.v_id = v.v_id
inner join params p on t1.p_id = p.p_id)
select part_id,
"1_V" v1,
"1_VAL" val1,
"2_V" v2,
"2_VAL" val2,
"1_P" p1,
"1_DESCR" descr1,
"2_P" p2,
"2_DESCR" descr2
from res
pivot (max(v_id) as v,
max(value) as val,
max(p_id) as p,
max(description) as descr
for id in (1, 2));
PART_ID V1 VAL1 V2 VAL2 P1 DESCR1 P2 DESCR2
---------- ---------- ---------- ---------- ---------- ---------- ------------ ---------- ------------
1 250 1 8 5 1 Weight (lbs) 2 Time (secs)
2 1348 2 9 6 28 Weight (kgs) 88 Time (mins)
4094 580 3 230 7 207 Length (ins) 726 Speed (mph)
7111 12 4 14 8 223 Length (cm) 195 Speed (kmph)
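Another alternative, which may work better if you have a lot of repeated ids (so that you can take advantage of scalar subquery caching), is simply to put scalar subqueries in the select list, like so: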
with table1 as (select 1 part_id, 250 v1, 8 v2, 1 p1, 2 p2 from dual union all
select 2 part_id, 1348 v1, 9 v2, 28 p1, 88 p2 from dual union all
select 4094 part_id, 580 v1, 230 v2, 207 p1, 726 p2 from dual union all
select 7111 part_id, 12 v1, 14 v2, 223 p1, 195 p2 from dual),
vals as (select 250 v_id, 1 value from dual union all
select 1348 v_id, 2 value from dual union all
select 580 v_id, 3 value from dual union all
select 12 v_id, 4 value from dual union all
select 8 v_id, 5 value from dual union all
select 9 v_id, 6 value from dual union all
select 230 v_id, 7 value from dual union all
select 14 v_id, 8 value from dual),
params as (select 1 p_id, 'Weight (lbs)' description from dual union all
select 28 p_id, 'Weight (kgs)' description from dual union all
select 207 p_id, 'Length (ins)' description from dual union all
select 223 p_id, 'Length (cm)' description from dual union all
select 2 p_id, 'Time (secs)' description from dual union all
select 88 p_id, 'Time (mins)' description from dual union all
select 726 p_id, 'Speed (mph)' description from dual union all
select 195 p_id, 'Speed (kmph)' description from dual)
select t1.part_id,
t1.v1,
(select value from vals v where v.v_id = t1.v1) val1,
t1.v2 v2,
(select value from vals v where v.v_id = t1.v2) val2,
t1.p1 p1,
(select description from params p where p.p_id = t1.p1) descr1,
t1.p2 p2,
(select description from params p where p.p_id = t1.p2) descr2
from table1 t1;
PART_ID V1 VAL1 V2 VAL2 P1 DESCR1 P2 DESCR2
---------- ---------- ---------- ---------- ---------- ---------- ------------ ---------- ------------
1 250 1 8 5 1 Weight (lbs) 2 Time (secs)
2 1348 2 9 6 28 Weight (kgs) 88 Time (mins)
4094 580 3 230 7 207 Length (ins) 726 Speed (mph)
7111 12 4 14 8 223 Length (cm) 195 Speed (kmph)
Which approach works better for you depends entirely on your data. As always, you should test each solution thoroughly!