Calculating the sum of a variable

Time: 2018-11-10 15:26:10

Tags: stata

I want to calculate the sum of the variable boasav for each id:

clear

input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

I know that the tabulate command can be used to summarize data, but it only gives counts:

bys id: tab boasav 

-> id = 1

     boasav |      Freq.     Percent        Cum.
------------+-----------------------------------
       2500 |          1       33.33       33.33
       2900 |          1       33.33       66.67
       4200 |          1       33.33      100.00
------------+-----------------------------------
      Total |          3      100.00

-> id = 2

     boasav |      Freq.     Percent        Cum.
------------+-----------------------------------
       5700 |          1       50.00       50.00
       6100 |          1       50.00      100.00
------------+-----------------------------------
      Total |          2      100.00

-> id = 3

     boasav |      Freq.     Percent        Cum.
------------+-----------------------------------
       7400 |          1       33.33       33.33
       7600 |          1       33.33       66.67
       8300 |          1       33.33      100.00
------------+-----------------------------------
      Total |          3      100.00

However, what I want is the following:

1    9600  
2   11800  
3   23300 

Is there a function in Stata that can do this?

2 answers:

Answer 0: (score: 1)

Solution 1: Use the list or table command to calculate and display the sums

bysort id: list, sum(boasav)

-> id = 1
      +-------------+
      | id   boasav |
      |-------------|
   1. |  1     2500 |
   2. |  1     2900 |
   3. |  1     4200 |
      |-------------|
  Sum |        9600 |
      +-------------+

-> id = 2
      +-------------+
      | id   boasav |
      |-------------|
   1. |  2     5700 |
   2. |  2     6100 |
      |-------------|
  Sum |       11800 |
      +-------------+

-> id = 3  
      +-------------+
      | id   boasav |
      |-------------|
   1. |  3     7400 |
   2. |  3     7600 |
   3. |  3     8300 |
      |-------------|
  Sum |       23300 |
      +-------------+

table id, contents(sum boasav)

-----------------------
       id | sum(boasav)
----------+------------
        1 |        9600
        2 |       11800
        3 |       23300
-----------------------
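A note on versions: as far as I know, table was rewritten in Stata 17 and contents() is not part of the new syntax. The sketch below assumes Stata 17 or later, where the group sum is requested with statistic() and the total statistic. On those versions the older syntax should still run under version control.

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* Assumes Stata 17 or later: statistic() takes the place of contents()
table id, statistic(total boasav)

* The pre-17 syntax, run under version control
version 16: table id, contents(sum boasav)
```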

Solution 2: Generate additional variables holding the results, then list them

bysort id (boasav): generate sum1 = sum(boasav)

by id: egen sum2 = total(boasav)

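Both approaches give the same group total in the last observation of each id (sum() is a running sum, while egen's total() stores the group total in every observation):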

by id: list sum* if _n == _N

-> id = 1  
     +-------------+
     | sum1   sum2 |
     |-------------|
  3. | 9600   9600 |
     +-------------+

-> id = 2  
     +---------------+
     |  sum1    sum2 |
     |---------------|
  2. | 11800   11800 |
     +---------------+

-> id = 3 
     +---------------+
     |  sum1    sum2 |
     |---------------|
  3. | 23300   23300 |
     +---------------+
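To see why the listing above is restricted to _n == _N, it can help to look at every observation. This is a minimal sketch; the variable names running, grouptotal and first are made up for illustration and are not part of the original answer.

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* sum() accumulates row by row within each id;
* egen's total() repeats the group total on every row
bysort id (boasav): generate running = sum(boasav)
by id: egen grouptotal = total(boasav)
list, sepby(id)

* Flag one observation per id and print one line per group
egen byte first = tag(id)
list id grouptotal if first, noobs
```

Because grouptotal is constant within id, any single observation per group can be listed; the running sum only equals the total in the last one.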

Solution 3: Create a new dataset containing the results, then list it

collapse (sum) boasav, by(id)
list

     +-------------+
     | id   boasav |
     |-------------|
  1. |  1     9600 |
  2. |  2    11800 |
  3. |  3    23300 |
     +-------------+

Note that this last solution destroys the dataset currently in memory.
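If the original observations are still needed afterwards, one common way around this is to wrap collapse in preserve and restore. A minimal sketch using the example data from the question:

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* Keep a copy of the data in memory before collapsing
preserve
collapse (sum) boasav, by(id)
list
* Bring back the original eight observations
restore
```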

Answer 1: (score: 1)

Here are three more.

clear

input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* Method 4: use summarize 

forval g = 1/3 { 
    su boasav if id == `g', meanonly 
    di "`g'  " %5.0f r(sum) 
} 

1   9600
2  11800
3  23300
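The loop in Method 4 hard-codes the range 1/3. A variation that does not need to know the id values in advance is sketched below, assuming id is numeric; the local macro name groups is an arbitrary choice.

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* Collect the distinct values of id instead of hard-coding 1/3
quietly levelsof id, local(groups)

foreach g of local groups {
    summarize boasav if id == `g', meanonly
    display "`g'  " %5.0f r(sum)
}
```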


* Method 5: tabstat 

tabstat boasav, by(id) stat(sum) 

Summary for variables: boasav
     by categories of: id 

      id |       sum
---------+----------
       1 |      9600
       2 |     11800
       3 |     23300
---------+----------
   Total |     44700
--------------------


* Method 6: use rangestat (SSC) 

rangestat (sum) boasav, int(id 0 0)

tabdisp id, c(boasav_sum)

-------------------------
       id | sum of boasav
----------+--------------
        1 |          9600
        2 |         11800
        3 |         23300
-------------------------
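rangestat is a community-contributed command, so Method 6 only runs once it has been installed. A minimal sketch, assuming an internet connection and a Stata installation that can reach SSC:

```stata
* One-time installation from SSC; needs internet access
ssc install rangestat
```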