Question

My panel data set looks like this:

ID    Usage     month    
1234    2        -2  
1234    4        -1
1234    3         1
1234    2         2
2345    5        -2
2345    6        -1
2345    3         1
2345    6         2

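Obviously there are many more IDs and usage records, but this is the general form. For each unique ID, I want to average Usage separately over the rows where month is negative and the rows where month is positive. My goal is to get something like this: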

ID   avg_usage_neg   avg_usage_pos
1234     3                  2.5
2345     5.5                4.5

Answer 1

Here are a couple of options.

First, create the test data:

data sample;
  input ID    
        Usage     
        month;
datalines;
1234    2        -2  
1234    4        -1
1234    3         1
1234    2         2
2345    5        -2
2345    6        -1
2345    3         1
2345    6         2
;
run;

Here is an SQL solution:

proc sql noprint;
  create table result as
  select id,
         /* ifn() returns missing when the condition is false; avg() skips missing values */
         avg(ifn(month < 0, usage, .)) as avg_usage_neg,
         avg(ifn(month > 0, usage, .)) as avg_usage_pos
  from sample
  group by id
  ;
quit;
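
The same conditional average can also be written with a standard CASE expression instead of the SAS-specific ifn() function, which may be easier to port to other SQL dialects. This is only a minimal sketch, assuming the sample table created above; the output table name result_case is made up for illustration.

```sas
proc sql noprint;
  /* result_case is a hypothetical table name used only for this sketch */
  create table result_case as
  select id,
         /* rows that fail the condition contribute a missing value, which avg() skips */
         avg(case when month < 0 then usage else . end) as avg_usage_neg,
         avg(case when month > 0 then usage else . end) as avg_usage_pos
  from sample
  group by id
  ;
quit;
```

With the sample data this should produce the same two rows as the ifn() version, since both forms pass missing values to avg() for the rows that do not match.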

Here is a data step / PROC MEANS solution:

data sample2;
  set sample;
  /* split usage into two columns; the one that does not apply is set to missing */
  usage_neg = ifn(month < 0, usage, .);
  usage_pos = ifn(month > 0, usage, .);
run;

/* nway keeps only the per-ID rows; mean= reuses the input variable names in the output */
proc means data=sample2 noprint missing nway;
  class id;
  var usage_neg usage_pos;
  output out=result2 mean=;
run;

Averaging panel data in SAS
