I want to calculate the sum of the variable boasav within each id:
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end
I know that the tabulate command can be used to summarize data, but it only reports counts:
bys id: tab boasav
-> id = 1

     boasav |      Freq.     Percent        Cum.
------------+-----------------------------------
       2500 |          1       33.33       33.33
       2900 |          1       33.33       66.67
       4200 |          1       33.33      100.00
------------+-----------------------------------
      Total |          3      100.00

-> id = 2

     boasav |      Freq.     Percent        Cum.
------------+-----------------------------------
       5700 |          1       50.00       50.00
       6100 |          1       50.00      100.00
------------+-----------------------------------
      Total |          2      100.00

-> id = 3

     boasav |      Freq.     Percent        Cum.
------------+-----------------------------------
       7400 |          1       33.33       33.33
       7600 |          1       33.33       66.67
       8300 |          1       33.33      100.00
------------+-----------------------------------
      Total |          3      100.00
However, what I want is the following:
1 9600
2 11800
3 23300
Is there a command in Stata that can do this?
Answer 0 (score: 1)
Solution 1: Use the list or table command to calculate and display the sums
bysort id: list, sum(boasav)
-> id = 1
+-------------+
| id boasav |
|-------------|
1. | 1 2500 |
2. | 1 2900 |
3. | 1 4200 |
|-------------|
Sum | 9600 |
+-------------+
-> id = 2
+-------------+
| id boasav |
|-------------|
1. | 2 5700 |
2. | 2 6100 |
|-------------|
Sum | 11800 |
+-------------+
-> id = 3
+-------------+
| id boasav |
|-------------|
1. | 3 7400 |
2. | 3 7600 |
3. | 3 8300 |
|-------------|
Sum | 23300 |
+-------------+
table id, contents(sum boasav)
-----------------------
id | sum(boasav)
----------+------------
1 | 9600
2 | 11800
3 | 23300
-----------------------
Solution 2: Generate additional variables holding the results, then list them
bysort id (boasav): generate sum1 = sum(boasav)
or
by id: egen sum2 = total(boasav)
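Both approaches give the same group totals. Note that sum1 is a running sum, so only the last observation in each group holds the total, whereas sum2 holds the total in every observation of the group. Listing the last observation of each group shows that they agree: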
by id: list sum* if _n == _N
-> id = 1
+-------------+
| sum1 sum2 |
|-------------|
3. | 9600 9600 |
+-------------+
-> id = 2
+---------------+
| sum1 sum2 |
|---------------|
2. | 11800 11800 |
+---------------+
-> id = 3
+---------------+
| sum1 sum2 |
|---------------|
3. | 23300 23300 |
+---------------+
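To get the one-row-per-id listing asked for in the question, one option is to flag a single observation per group and list only that one. The following is a minimal sketch that assumes the example data from the question; the variable names idtotal and pickone are made up here for illustration.

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* group total, repeated on every observation of the group
bysort id: egen idtotal = total(boasav)

* tag() marks exactly one observation per distinct id with 1, all others with 0
egen byte pickone = tag(id)

* one row per id, without observation numbers
list id idtotal if pickone, noobs
```

Unlike the _n == _N condition, this does not depend on which observation happens to be last in the group, because idtotal is constant within id.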
Solution 3: Create a new dataset containing the results and list it
collapse (sum) boasav, by(id)
list
+-------------+
| id boasav |
|-------------|
1. | 1 9600 |
2. | 2 11800 |
3. | 3 23300 |
+-------------+
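Note that this last solution replaces the dataset in memory with the collapsed one, so the original observations are lost unless you saved or preserved them first.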
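If the original observations are still needed afterwards, one way around this is to wrap the collapse in preserve and restore. This is only a sketch using the example data from the question:

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* keep a temporary copy of the data in memory
preserve

* replace the data with one observation per id holding the sum
collapse (sum) boasav, by(id)
list

* bring back the original eight observations
restore
count
```

After restore, the data in memory are the original observations again, so the collapsed sums are available only in the listing shown in between.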
Answer 1 (score: 1)
Here are three more.
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end
* Method 4: use summarize
forval g = 1/3 {
    su boasav if id == `g', meanonly
    di "`g' " %5.0f r(sum)
}
1 9600
2 11800
3 23300
* Method 5: tabstat
tabstat boasav, by(id) stat(sum)
Summary for variables: boasav
by categories of: id
id | sum
---------+----------
1 | 9600
2 | 11800
3 | 23300
---------+----------
Total | 44700
--------------------
* Method 6: use rangestat (SSC)
rangestat (sum) boasav, int(id 0 0)
tabdisp id, c(boasav_sum)
-------------------------
id | sum of boasav
----------+--------------
1 | 9600
2 | 11800
3 | 23300
-------------------------
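Method 4 hardcodes the group values 1/3 in the loop. When the ids are not consecutive integers, a possible variant is to let levelsof collect the distinct values first and loop over those. The sketch below assumes that id is numeric and reuses the example data; the local macro name ids is arbitrary.

```stata
clear
input id boasav
1 2500
1 2900
1 4200
2 5700
2 6100
3 7400
3 7600
3 8300
end

* store the distinct values of id in the local macro ids
quietly levelsof id, local(ids)

* loop over whatever values are present instead of a fixed range
foreach g of local ids {
    summarize boasav if id == `g', meanonly
    display "`g' " %5.0f r(sum)
}
```

As in Method 4, summarize with the meanonly option still leaves the sum in r(sum), so nothing else needs to change.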