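I have a table that contains, for each student, the courses selected in each of two semesters. None of these students validated their first semester, so valid_or_not_of_semester='N' for every row with semester='1st':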
student semester course_selected valid_or_not_of_semester
A 1st math N
A 1st english N
A 2nd math Y
A 2nd english Y
B 1st math N
B 2nd math Y
B 2nd english Y
C 1st math N
C 2nd math N
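For the students who selected math (or english) in the first semester, I want to find out whether they selected math (or english) in the second semester. If they did, I want to build a cross tabulation that counts how many of them validated the second semester and how many did not: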
--------------------------------------------------------------------------
1st semester \ 2nd semester | Math | English
invalid \ |---------------------|--------------------
students \ | valid | invalid | valid | invalid
--------------------------------------------------------------------------
Math | 2 | 1 | 2 | 0
--------------------------------------------------------------------------
English | 1 | 0 | 1 | 0
--------------------------------------------------------------------------
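Each row counts the students who did not validate the first semester and who selected that course in the first semester. The columns split the students who selected each course in the second semester into valid and invalid. More precisely,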
--------------------------------------------------------------------------
1st semester \ 2nd semester | Math | English
invalid \ |---------------------|--------------------
students \ | valid | invalid | valid | invalid
--------------------------------------------------------------------------
Math | 2 | 1 | 2 | 0
| | |
\ / \ / \ /
(students A&B) (student C) (students A&B)
I tried a data step followed by proc sql:
data math;
merge have
have (where=(semester='1st') in=these);
by student;
if these then output;
run;
proc sql;
create table result as
select count(distinct student) as nb_student
from math (where=(semester='2nd'))
group by course_selected, valid_or_not_of_semester;
quit;
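I then do the same thing for english.
But is there a way to get the results for both courses directly? And how could I use proc freq for this?
I look forward to your answers.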
Answer 0 (score: 1)
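This does not give you exactly the table you want, but it does produce the values you are interested in. The idea is to transpose the original data set so that there is one row per student, and then count the observations.
You may also want to look at proc tabulate, but you could run into problems there because in some cases you would be counting the same student more than once.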
data temp;
input student $ semester $ course_selected $ valid_or_not_of_semester $;
datalines;
A 1st math N
A 1st english N
A 2nd math Y
A 2nd english Y
B 1st math N
B 2nd math Y
B 2nd english Y
C 1st math N
C 2nd math N
;
proc sort data = temp;
by student;
run;
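/* One row per student; the ID values are concatenated into column names such as math1st and english2nd. */
proc transpose data = temp out = temp2;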
by student;
id course_selected semester;
var valid_or_not_of_semester;
run;
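/* Each sum counts the students in one cell: 1st-semester course, 2nd-semester course, 2nd-semester validity. */
proc sql;
create table temp3 as select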
sum(case when math1st = "N" and math2nd = "Y" then 1 else 0 end) as math_math_valid,
sum(case when math1st = "N" and math2nd = "N" then 1 else 0 end) as math_math_invalid,
sum(case when english1st = "N" and math2nd = "Y" then 1 else 0 end) as english_math_valid,
sum(case when english1st = "N" and math2nd = "N" then 1 else 0 end) as english_math_invalid,
sum(case when math1st = "N" and english2nd = "Y" then 1 else 0 end) as math_english_valid,
sum(case when math1st = "N" and english2nd = "N" then 1 else 0 end) as math_english_invalid,
sum(case when english1st = "N" and english2nd = "Y" then 1 else 0 end) as english_english_valid,
sum(case when english1st = "N" and english2nd = "N" then 1 else 0 end) as english_english_invalid
from temp2;
quit;
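Since the question asks about proc freq, here is a rough sketch of another route, offered as one possibility rather than a definitive solution. It assumes the temp data set created above and that each student has at most one row per semester and course. The names pairs, first_course, second_course and second_valid are made up for this illustration. The self-join pairs every first-semester row of a student with every second-semester row of the same student, so a student who selected two courses in the first semester deliberately contributes to both rows of the cross tabulation.

```sas
/* Pair each student's 1st-semester courses with that student's 2nd-semester courses. */
/* pairs, first_course, second_course and second_valid are hypothetical names.        */
proc sql;
  create table pairs as
  select a.student,
         a.course_selected          as first_course,
         b.course_selected          as second_course,
         b.valid_or_not_of_semester as second_valid
  from temp as a
       inner join temp as b
       on a.student = b.student
  where a.semester = '1st'
    and a.valid_or_not_of_semester = 'N'
    and b.semester = '2nd';
quit;

/* One line per non-empty cell of the cross tabulation. */
proc freq data = pairs;
  tables first_course * second_course * second_valid / list nopercent nocum;
run;

/* Same counts, laid out with 1st-semester courses as rows and       */
/* 2nd-semester course by validity as nested columns.                */
proc tabulate data = pairs;
  class first_course second_course second_valid;
  table first_course, second_course * second_valid * n;
run;
```

Combinations that never occur in the data (for example english in the first semester followed by an invalid second semester) are simply absent from the proc freq list instead of being reported as 0.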