I am a SAS developer. I am using PROC SQL to run a UNION ALL. My code:
proc sql;
create table test3 as
select a.state
,a.station
,a.ca_no
,a.applicant_name
,a.capacity
,a.commission_date
,a.technology
,a.pmu
,a.ppu
,a.ssu_pe
,a.re_switch_no
,a.voltage
,a.vcb_brand_and_model
,a.scada_y_n
,a.gps_coordinate
,a.plant_manager_phone_number
,a.plant_manager_name
,a.plant_manager_email
,a.highest_md_recorded_a
,a.highest_md_recorded_kw
,a.total_energy_sold
%do c=1 %to 12;
,a.kwh_&&ALLDATES&c..
%end;
%do c=1 %to 12;
,a.gen_factor_&&ALLDATES&c..
%end;
,a.period
from test a
union all
select b.pss_no as ca_no
,b.applicant_name /*capacity_mw voltage technology*/
,b.program
,b.scod_date
,b.kick_off_date
from newresheet2 b;
quit;
As you can see, after the rename in table B, ca_no is the only column the two tables have in common.
I got this error:
MPRINT(TRASPOSETRX):   proc sql;
MPRINT(TRASPOSETRX):   create table test3 as select a.state ,a.station ,a.ca_no ,a.applicant_name ,a.capacity ,a.commission_date ,a.technology ,a.pmu ,a.ppu ,a.ssu_pe ,a.re_switch_no ,a.voltage ,a.vcb_brand_and_model ,a.scada_y_n ,a.gps_coordinate ,a.plant_manager_phone_number ,a.plant_manager_name ,a.plant_manager_email ,a.highest_md_recorded_a ,a.highest_md_recorded_kw ,a.total_energy_sold ,a.kwh_SEPT17 ,a.kwh_OCT17 ,a.kwh_NOV17 ,a.kwh_DEC17 ,a.kwh_JAN18 ,a.kwh_FEB18 ,a.kwh_MAR18 ,a.kwh_APR18 ,a.kwh_MAY18 ,a.kwh_JUN18 ,a.kwh_JULY18 ,a.kwh_AUG18 ,a.gen_factor_SEPT17 ,a.gen_factor_OCT17 ,a.gen_factor_NOV17 ,a.gen_factor_DEC17 ,a.gen_factor_JAN18 ,a.gen_factor_FEB18 ,a.gen_factor_MAR18 ,a.gen_factor_APR18 ,a.gen_factor_MAY18 ,a.gen_factor_JUN18 ,a.gen_factor_JULY18 ,a.gen_factor_AUG18 ,a.period from test a union all select b.pss_no as ca_no ,b.applicant_name ,b.program ,b.scod_date ,b.kick_off_date from newresheet2 b;
WARNING: A table has been extended with null columns to perform the UNION ALL set operation.
ERROR: Column 5 from the first contributor of UNION ALL is not the same type as its counterpart from the second.
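I checked the data type of ca_no in both tables, and both are character. When I count to the fifth column in table A (capacity), table B has no column called capacity. In fact, I commented out Capacity_MW from table B because its name does not match. Is that the reason?
Answer 0: (score: 2)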
CREATE TABLE test3 AS
SELECT
a.STATE
, a.station
, a.ca_no
, a.applicant_name
, a.capacity
/* ---- more than 5 columns: the list continues ---- */
, a.commission_date
, a.technology
, a.pmu
, a.ppu
, a.ssu_pe
, a.re_switch_no
, a.voltage
, a.vcb_brand_and_model
, a.scada_y_n
, a.gps_coordinate
, a.plant_manager_phone_number
, a.plant_manager_name
, a.plant_manager_email
, a.highest_md_recorded_a
, a.highest_md_recorded_kw
, a.total_energy_sold
, a.kwh_SEPT17
, a.kwh_OCT17
, a.kwh_NOV17
, a.kwh_DEC17
, a.kwh_JAN18
, a.kwh_FEB18
, a.kwh_MAR18
, a.kwh_APR18
, a.kwh_MAY18
, a.kwh_JUN18
, a.kwh_JULY18
, a.kwh_AUG18
, a.gen_factor_SEPT17
, a.gen_factor_OCT17
, a.gen_factor_NOV17
, a.gen_factor_DEC17
, a.gen_factor_JAN18
, a.gen_factor_FEB18
, a.gen_factor_MAR18
, a.gen_factor_APR18
, a.gen_factor_MAY18
, a.gen_factor_JUN18
, a.gen_factor_JULY18
, a.gen_factor_AUG18
, a.period
FROM test a
That part selects more than 5 columns, while the following part selects exactly 5:
UNION ALL
SELECT
b.pss_no AS ca_no
, b.applicant_name
, b.program
, b.scod_date
, b.kick_off_date
FROM newresheet2 b;
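A union requires the same number of columns in each subquery, and each pair of corresponding columns must have "compatible" data types (for example, an integer will go into a decimal column, but a varchar will not go into a date column).
Is each of these column pairs compatible?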
SELECT
a.STATE
, a.station
, a.ca_no
, a.applicant_name
, a.capacity
FROM test a
UNION ALL
SELECT
b.pss_no AS ca_no
, b.applicant_name
, b.program
, b.scod_date
, b.kick_off_date
FROM newresheet2 b;
Columns are "aligned" not by column name or alias but by their position in the select clause. In the query below, a.ca_no lines up with b.pss_no, and a.applicant_name lines up with b.applicant_name:
SELECT
a.ca_no
, a.applicant_name
FROM test a
UNION ALL
SELECT
b.pss_no AS ca_no
, b.applicant_name
FROM newresheet2 b;
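If you do want PROC SQL to match columns by name rather than by position, one option is OUTER UNION CORR, which overlays same-named columns and keeps the rest. The sketch below uses sashelp.class as a stand-in for both of your tables (an assumption, since I cannot see your data); the table name want and the column subsets are only for illustration.

```sas
proc sql;
  /* First query: four columns */
  create table want as
  select name, sex, age, height
  from sashelp.class
  outer union corr
  /* Second query: only two columns; CORR matches them by name */
  select name, age
  from sashelp.class(obs=3);
quit;
```

The rows from the second query should come out with missing values in sex and height, because those columns have no same-named counterpart there.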
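Answer 1: (score: 0)
The problem is that the query before UNION ALL projects more columns than the query after the UNION ALL operator.
You need to make sure both queries select the same number of columns.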
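As a minimal sketch of what that can look like, the second query can be padded with missing-value placeholders of the matching type, so that both select lists have the same length and the same types in each position. The example uses sashelp.class in place of your tables, and the output name padded is hypothetical.

```sas
proc sql;
  create table padded as
  select name, sex, age, height
  from sashelp.class
  union all
  /* ' ' is a character placeholder for sex, . is a numeric one for height */
  select name, ' ' as sex, age, . as height
  from sashelp.class(obs=3);
quit;
```

Each placeholder has to sit in the same position as the column it stands in for, since UNION ALL pairs columns by position.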
Answer 2: (score: 0)
The easiest way to solve this is to append with a SET statement, which matches columns by name rather than by position:
/*create table example of your first dataset. table with more columns*/
data class1;
set sashelp.class;
run;
/*create table example for your second dataset with fewer column and a different name*/
data class(keep = name age gender);
set sashelp.class;
gender =sex;
run;
/* append it using set statement using rename. same name column append together,
missing values for other columns where there is no match*/
data want;
set class1 class(rename = (gender=sex));
run;
You can also use PROC APPEND, which likewise appends by column name rather than by column position.
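A rough sketch of the PROC APPEND route, reusing the same example tables built from sashelp.class. As far as I know, variables that exist only in the BASE= table are filled with missing values for the appended rows and SAS writes a warning to the log, so treat the exact log output as something to check on your side.

```sas
/* Rebuild the two example tables used in the SET example */
data class1;
  set sashelp.class;
run;

data class(keep=name age gender);
  set sashelp.class;
  gender = sex;
run;

/* Append CLASS onto CLASS1 in place; RENAME= lines gender up with sex */
proc append base=class1 data=class(rename=(gender=sex));
run;
```

Unlike the SET approach, this modifies class1 itself instead of creating a new table, which avoids rereading the base table but means you should keep a copy if you need the original.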