Conditional imputation in SAS

Posted: 2015-04-06 13:47:47

Tags: sas

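My data is a list of schools and their performance on assessments in certain subjects, along with the percentage of each gender taking the course. I've created a sample dataset below: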

data have;
    input school $ subject $ perc_male perc_female score similar_school $;
datalines;
X math 51 49 93 Y
X english 48 52 95 Y
X tech 60 40 90 Y
X science 57 43 92 Y
Y math . . 87 X
Y english . . 83 X
Y science . . 81 X
Y language . . 91 X
Z math 40 60 78 Z
Z english 50 50 76 Z
Z science 45 55 80 Z
;
run;

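As you can see, no gender percentages were collected for school Y. Research suggests that school X has a very similar gender distribution, so I want to impute the subject-specific percentages from X onto Y. A further problem is that Y has a language score, while X was not assessed in that subject. In that case, I want to take the mean of the imputed values (51, 48, 57), which gives 52 as the percentage of males taking the language course.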
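The fallback arithmetic can be checked with a throwaway DATA step. This is only a sketch of the calculation described above; the variable names `male_mean` and `female_mean` are made up for illustration, and the female figures are the X values for the same three shared subjects.

```sas
data _null_;
    /* Percentages borrowed from X for the subjects both schools share:
       math, english, science */
    male_mean   = mean(51, 48, 57);
    female_mean = mean(49, 52, 43);
    /* Write both results to the SAS log */
    put male_mean= female_mean=;
run;
```

The log should show 52 for males and 48 for females, which are the values wanted for the `Y language` row.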

The following shows the output I want:

data want;
    input school $ subject $ perc_male perc_female score similar_school $;
datalines;
X math 51 49 93 Y
X english 48 52 95 Y
X tech 60 40 90 Y
X science 57 43 92 Y
Y math 51 49 87 X
Y english 48 52 83 X
Y science 57 43 81 X
Y language 52 48 91 X
Z math 40 60 78 Z
Z english 50 50 76 Z
Z science 45 55 80 Z
;
run;

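I got a downvote, so I'm adding what I have tried, which gets me almost to where I need to be. To whoever downvoted: I'd like to know if you have any constructive feedback. Thanks! I'm wondering whether there is a way to build the mean-imputation part into my current snippet. I also suspect there may be a more efficient way to do this. Any help would be greatly appreciated.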

proc sql;
    select distinct cats("'",similar_school,"'") into :school_list separated by ','
    from have
    where perc_male=.;
quit;

proc sql;
    create table stuff as
    select similar_school as school, subject, perc_male, perc_female
    from have
    where school in (&school_list.);
quit;

proc sql;
    create table want2 as
    select a.school, a.subject, coalesce(a.perc_male,b.perc_male) as perc_male, coalesce(a.perc_female,b.perc_female) as perc_female, a.score, a.similar_school
    from have as a
    left join stuff as b
        on a.school=b.school and a.subject=b.subject
    ;
quit;

1 answer:

Answer 0 (score: 1)

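Given your expected data, simple SQL can solve your problem. First, self-join the table on school and similar_school, then coalesce the perc_male and perc_female values. That takes care of your first problem. For the second part of your question, compute the mean for each school and coalesce perc_male and perc_female with that school-level mean. Have a look at the SQL below and let me know if it helps.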

proc sql;
create table want as
select aa.school
     , aa.subject
       /* Still missing after the join? Fall back to the school-level mean.
          PROC SQL remerges the grouped mean onto every row of the school. */
     , coalesce(aa.perc_male, mean(aa.perc_male)) as perc_male
     , coalesce(aa.perc_female, mean(aa.perc_female)) as perc_female
     , aa.score
     , aa.similar_school
from (
        /* Self-join: borrow subject-level percentages from the similar school */
        select a.school
             , a.subject
             , coalesce(a.perc_male, b.perc_male) as perc_male
             , coalesce(a.perc_female, b.perc_female) as perc_female
             , a.score
             , a.similar_school
         from have as a
         left join have as b
              on b.school=a.similar_school
             and a.subject=b.subject
      ) as aa
group by aa.school
;
quit;
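
The same two-stage logic can also be written as separate steps. This makes each stage easy to inspect and does not rely on PROC SQL remerging summary statistics onto the detail rows. The following is a minimal sketch rather than part of the original answer; the table names `step1`, `school_means` and `want_steps` and the columns `mean_male` and `mean_female` are hypothetical names introduced here, and it assumes each (school, subject) pair appears only once in `have`.

```sas
proc sql;
    /* Step 1: borrow subject-level percentages from the similar school */
    create table step1 as
    select a.school
         , a.subject
         , coalesce(a.perc_male, b.perc_male) as perc_male
         , coalesce(a.perc_female, b.perc_female) as perc_female
         , a.score
         , a.similar_school
    from have as a
    left join have as b
         on b.school = a.similar_school
        and b.subject = a.subject;

    /* Step 2: per-school means of whatever is non-missing after step 1 */
    create table school_means as
    select school
         , mean(perc_male) as mean_male
         , mean(perc_female) as mean_female
    from step1
    group by school;

    /* Step 3: fill the remaining gaps with the school-level mean */
    create table want_steps as
    select s.school
         , s.subject
         , coalesce(s.perc_male, m.mean_male) as perc_male
         , coalesce(s.perc_female, m.mean_female) as perc_female
         , s.score
         , s.similar_school
    from step1 as s
    left join school_means as m
         on s.school = m.school
    order by s.school, s.subject;
quit;
```

With the sample data, the `Y language` row in `want_steps` should come out as 52 and 48, because the school Y means are taken over the math, english and science values borrowed from X.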