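I am trying to come up with code that will select a random column from a group of columns of interest. The group of columns changes depending on the values in the columns for each observation. Each observation is a subject.
Let me explain more clearly:
I have 8 columns, named V1-V8. Each column has 3 potential responses ('Small', 'Medium', 'High'). Due to some circumstances in our project, I need to "combine" all of this information into 1 column.
Key factor 1: For each subject, we only want the columns where he/she chose 'High' (there are many combinations here). This is what I mean when I say the columns of interest change for each subject.
Key factor 2: Once I have identified which columns have 'High' chosen for a subject, randomly select one of those columns.
In the end, I need a new variable (New_V) whose values are V1-V8 (not 'Small', 'Medium', 'High'), indicating which column was chosen for each subject.
Any advice would be great. I have tried ARRAYs and macro variables, but I can't seem to attack this the right way.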
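Answer 0: (score: 1)
You are on the right track with arrays. The vname function will be useful here. The want data step shows how to do this (the rest just sets up the example data):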
proc format;
    value smh
        1='Small'
        2='Medium'
        3='High'
        other=' '
    ;
quit;

data have;
    call streaminit(5);
    array v[8] $;
    do _i = 1 to 1000;
        do _j = 1 to 8;
            __rand = ceil(1+rand('Binomial',.7,2));
            v[_j] = put(__rand,smh6.);
        end;
        if whichc('High',of v[*]) = 0 then v8 = 'High'; *guarantee have one high;
        output;
    end;
    drop _:;
run;

data want;
    call streaminit(7); *arbitrary seed here, pick any positive number;
    set have;
    array v[8] ;
    do until (v[_rand] = 'High'); *repeat this loop until one is picked that is High;
        _rand = ceil(8*rand('Uniform'));
    end;
    chosen_v = vname(v[_rand]); *assign the chosen name to chosen_v variable;
    drop _:;
run;

proc freq data=want;
    tables chosen_v;
run;
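The loop in the want data step keeps drawing until it lands on a 'High' column, so it relies on every row having at least one 'High' (the have step guarantees that for the example data). If real data can contain rows with no 'High' at all, one possible variant is to collect the positions of the 'High' columns first and then draw among those only. The following is a minimal sketch under that assumption, not a tested replacement; the dataset name want_direct, the helper names _idx, _n and _pick, and the choice to leave chosen_v blank when nothing qualifies are all illustrative assumptions.

```sas
data want_direct;                     /* hypothetical output dataset name */
    call streaminit(7);               /* arbitrary seed, as in the want step */
    set have;
    array v[8];                       /* v1-v8, already character in have */
    array _idx[8] _temporary_;        /* positions of the 'High' columns */
    length chosen_v $32;

    /* collect the index of every column that holds 'High' */
    _n = 0;
    do _j = 1 to dim(v);
        if v[_j] = 'High' then do;
            _n = _n + 1;
            _idx[_n] = _j;
        end;
    end;

    if _n > 0 then do;
        /* uniform draw over 1.._n, then map back to the column name */
        _pick = ceil(_n * rand('Uniform'));
        chosen_v = vname(v[_idx[_pick]]);
    end;
    else chosen_v = ' ';              /* assumption: blank when no 'High' */

    drop _:;
run;
```

Each row needs exactly one random draw this way, and a row without any 'High' cannot cause an endless loop. Rename chosen_v to New_V if you want the variable name from the question.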
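Answer 1: (score: 1)
This approach uses a macro variable and a macro loop. There are three main steps: first, find all variables that are "high". Second, pick a random integer from 1 to the number of "high" variables. Third, select that variable and store its name in selected_var.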
data temp;
    input subject $ v1 $ v2 $ v3 $ v4 $ v5 $ v6 $ v7 $ v8 $;
    datalines;
1 high medium small high medium small high medium
2 medium small high medium small high medium high
3 small high high medium small high medium high
4 medium medium high medium small small medium medium
5 medium medium high small small high medium small
6 small small high medium small high high high
7 small small small small small small small small
8 high high high high high high high high
;
run;

%let vars = v1 v2 v3 v4 v5 v6 v7 v8;

%macro find_vars;
    data temp2;
        set temp;
        /*find possible variables; their names are concatenated into one string*/
        format possible_vars $20.;
        %do i = 1 %to %sysfunc(countw(&vars.));
            %let this_var = %scan(&vars., &i.);
            if &this_var. = "high" then possible_vars = cats(possible_vars, "&this_var.");
        %end;
        /*create a random integer between 1 and number of variables to select from*/
        /*note: dividing by 2 assumes every name in &vars. is exactly 2 characters long*/
        rand = 1 + floor((length(possible_vars) / 2) * rand("Uniform"));
        /*pick that one! (subjects with no "high", such as subject 7, get a blank)*/
        selected_var = substr(possible_vars, (rand * 2 - 1), 2);
    run;
%mend find_vars;

%find_vars;