Question

My dataset looks like the one in the attached picture. I only want the observations that have the same amount every year.

How can I do this in SAS with PROC SQL? Would it be easier in Stata? If so, which procedure could I use?

Answer 1

You look like a new Stack Overflow user. Welcome. Your question is being downvoted for at least three reasons:

1) It's not really clear what you want from your description of the problem and the data
   you're providing.

2) You haven't shown what you've tried.

3) Providing your data as a picture is not great. It's most helpful to provide data in a
   form that's easy for others to consume in their own programs. After all, you're asking
   for our help, so make it easier for us to help you. If you include something like the
   following, we just have to copy and paste to create your dataset to work with:

    DATA test;
        INPUT ID YEAR EXEC SUM;
        DATALINES;
    1573 1997 50 1080
    1581 1997 51  300
    1598 1996 54   80
    1598 1998 54   80
    1598 1999 54   80
    1602 1996 55  112.6
    1602 1997 55  335.965
    ;
    RUN;

That said, the following might give you what you want, but it's only a guess, since I'm not sure this is really what you're asking for:

    /* Keep every row whose SUM value occurs more than once in the table */
    proc sql noprint;
        create table testout as
        select *, count(*) as cnt
        from test
        group by sum
        having cnt > 1;
    quit;

Are you asking to show all rows that share the same SUM, or something else?
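If the goal is instead to keep only the IDs whose SUM is identical in every year they appear, a minimal sketch could look like the following. This is an assumption about what the question means, not a confirmed reading. It reuses the `test` dataset above, and the output table name `same_sum` is hypothetical.

```sas
/* Sketch under an assumed reading of the question:              */
/* keep an ID only if it appears in more than one year and its   */
/* SUM never changes (min and max of SUM within the ID agree).   */
proc sql;
    create table same_sum as   /* same_sum is a hypothetical name */
    select *
    from test
    group by ID
    having count(*) > 1
       and min(SUM) = max(SUM)
    order by ID, YEAR;
quit;
```

Because the query selects every column while grouping only by ID, SAS remerges the group statistics back onto the detail rows and notes this in the log. With the sample data above, only ID 1598 should satisfy both conditions.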

Answer 2

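Assuming I understand your question correctly, you want to keep observations for the same company/individual only if that company has the same amount every year. If so, this is what I would try in Stata: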

clear
input ID YEAR EXEC SUM
    1573 1997 50 1080
    1581 1997 51  300
    1598 1996 54   80
    1598 1998 54   80
    1598 1999 54   80
    1602 1996 55  112.6
    1602 1997 55  335.965
    1575 1997 50 1080
    1575 1998 51 1080
    1595 1996 54   80
    1595 1998 54   30
    1595 1999 54   80
    1605 1996 55  112.6
    1605 1997 55  335.965
end

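* Number the rows within each ID/SUM group; a group with a single row gets 0
bysort ID SUM: gen drop = cond(_N == 1, 0, _n)
* Drop rows whose SUM is not shared by another row of the same ID
drop if drop == 0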

The result (using my data):

   
        ID      YEAR    EXEC    SUM     drop

    1.  1575    1997    50      1080    1
    2.  1575    1998    51      1080    2
    3.  1595    1999    54      80      1
    4.  1595    1996    54      80      2
    5.  1598    1996    54      80      1
    6.  1598    1998    54      80      2
    7.  1598    1999    54      80      3

SAS data organization
