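I have many variables for which I need descriptive statistics (means). However, I want the values of a categorical variable (AlcCons1) to appear as the columns.
I use the following code to do this: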
tabstat Age25_29 Age30_34 ... SmokeY religAtndY, statistics( mean ) by(AlcCons1)
I get a result like this:
AlcCons1 | Age25_29 Age30_34 Age35_39 Age40_44 Age45_49 Age50_54 Age55_59
---------+----------------------------------------------------------------------
1 | .0987326 .0936242 .1243994 .1668614 .1579665 .1481626 .1258278
2 | .1037879 .11853 .1451863 .1415631 .1317288 .1231884 .1387164
3 | .0905679 .1151016 .1405161 .1624963 .1506231 .137278 .123246
4 | .0649853 .0716117 .1094201 .1606857 .1786286 .1630888 .1401794
---------+----------------------------------------------------------------------
Total | .091001 .0986022 .1286311 .1617972 .156643 .144962 .1289952
------------------------------
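How can I swap the columns and rows (that is, transpose the table)?
Answer 0: (score: 1)
In principle, the answer is the option c(statistics). It is legal for this kind of example and it produces a kind of transpose, but the result is not an exact transpose. Here is one way to do better.
There is no reproducible example in the question, so we need to find one.
That means are being used is incidental. The same problem would arise with any other statistic.
This is the kind of table we might want to transpose.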
. sysuse census, clear
(1980 Census data by state)
. tabstat poplt5-pop65p , s(p50) by(region)
Summary statistics: p50
by categories of: region (Census region)
region | poplt5 pop5_17 pop18p pop65p
--------+----------------------------------------
NE | 185188 637731 2284657 364864
N Cntrl | 327094.5 936449 3126055 521880.5
South | 289571.5 880546 2803536 407053.5
West | 114731 303176 884987 109220
--------+----------------------------------------
Total | 227467.5 629654 2175130 370495
-------------------------------------------------
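For comparison, the c(statistics) option mentioned above can be tried on the same data. This is only a sketch of the call; the output is not reproduced here, and as noted it gives a kind of transpose rather than an exact one.

```stata
* Same census example, with statistics rather than variables as columns.
* c() is the abbreviation of tabstat's columns() option.
sysuse census, clear
tabstat poplt5-pop65p, s(p50) by(region) c(statistics)
```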
Trick 1: Simplify the problem by reducing the dataset to just the values we want to tabulate.
. collapse (p50) poplt5-pop65p, by(region)
. l
+---------------------------------------------------------+
| region poplt5 pop5_17 pop18p pop65p |
|---------------------------------------------------------|
1. | NE 185,188 637,731 2,284,657 364,864 |
2. | N Cntrl 327,094.5 936,449 3,126,054.5 521,880.5 |
3. | South 289,571.5 880,546 2,803,536 407,053.5 |
4. | West 114,731 303,176 884,987 109,220 |
+---------------------------------------------------------+
Trick 2: Use reshape to map the separate variables, one per category, to the values of a single categorical variable.
. reshape long pop, i(region) j(age) string
(note: j = 18p 5_17 65p lt5)
Data wide -> long
-----------------------------------------------------------------------------
Number of obs. 4 -> 16
Number of variables 5 -> 3
j variable (4 values) -> age
xij variables:
pop18p pop5_17 ... poplt5 -> pop
-----------------------------------------------------------------------------
. l, sepby(region)
+------------------------------+
| region age pop |
|------------------------------|
1. | NE 18p 2,284,657 |
2. | NE 5_17 637,731 |
3. | NE 65p 364,864 |
4. | NE lt5 185,188 |
|------------------------------|
5. | N Cntrl 18p 3,126,054.5 |
6. | N Cntrl 5_17 936,449 |
7. | N Cntrl 65p 521,880.5 |
8. | N Cntrl lt5 327,094.5 |
|------------------------------|
9. | South 18p 2,803,536 |
10. | South 5_17 880,546 |
11. | South 65p 407,053.5 |
12. | South lt5 289,571.5 |
|------------------------------|
13. | West 18p 884,987 |
14. | West 5_17 303,176 |
15. | West 65p 109,220 |
16. | West lt5 114,731 |
+------------------------------+
Trick 3: Use tabdisp directly.
. tabdisp age region, c(pop)
--------------------------------------------------------------
| Census region
age | NE N Cntrl South West
----------+---------------------------------------------------
18p | 2,284,657 3,126,054.5 2,803,536 884,987
5_17 | 637,731 936,449 880,546 303,176
65p | 364,864 521,880.5 407,053.5 109,220
lt5 | 185,188 327,094.5 289,571.5 114,731
--------------------------------------------------------------
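Trick 4: Some cleaning up may be needed. Here the string categories sort alphabetically, so we encode them in the desired order and then improve the value labels.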
. label def age 1 lt5 2 5_17 3 18p 4 65p
. encode age , gen(ageclass) label(age)
. tab ageclass
ageclass | Freq. Percent Cum.
------------+-----------------------------------
lt5 | 4 25.00 25.00
5_17 | 4 25.00 50.00
18p | 4 25.00 75.00
65p | 4 25.00 100.00
------------+-----------------------------------
Total | 16 100.00
. label def age 1 "<5" 2 "5-17" 3 "18-64" 4 "65+", modify
. tabdisp ageclass region, c(pop)
--------------------------------------------------------------
| Census region
ageclass | NE N Cntrl South West
----------+---------------------------------------------------
<5 | 185,188 327,094.5 289,571.5 114,731
5-17 | 637,731 936,449 880,546 303,176
18-64 | 2,284,657 3,126,054.5 2,803,536 884,987
65+ | 364,864 521,880.5 407,053.5 109,220
--------------------------------------------------------------
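The four tricks above can be collected into a single do-file. This is a minimal sketch using the same census example; the preserve and restore lines are an addition, not part of the transcript, so that the collapsed data do not replace the dataset in memory.

```stata
* Sketch: transpose a tabstat-style table of medians by region.
sysuse census, clear
preserve                                   // addition: keep the original data safe

* Trick 1: reduce the data to the statistics to be tabulated
collapse (p50) poplt5-pop65p, by(region)

* Trick 2: stack the pop* variables into one variable indexed by age
reshape long pop, i(region) j(age) string

* Trick 4: encode age in the desired order, then tidy the value labels
label define age 1 "lt5" 2 "5_17" 3 "18p" 4 "65p"
encode age, gen(ageclass) label(age)
label define age 1 "<5" 2 "5-17" 3 "18-64" 4 "65+", modify

* Trick 3: display with age classes as rows and regions as columns
tabdisp ageclass region, c(pop)

restore
```

For data like those in the question, the same pattern would need variable names that share a common stub for reshape long to work; variables that do not would have to be renamed first.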
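Answer 1: (score: 0)
I found an answer at the following link: https://www.stata.com/statalist/archive/2005-09/msg00561.html I was trying to transpose the table, so I installed the tabstatmat command and ran: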
ssc install tabstatmat, replace
tabstat Age25_29 Age30_34 CurntSmokeY religAtndY, by(AlcCons1) stat(mean) col(stat) long format(%9.2f) save
qui tabstatmat B
matrix B = B'
matrix list B, f(%9.2f)
I got what I needed:
B[41,5]
1: 2: 3: 4: Total:
mean mean mean mean mean
Age25_29 0.10 0.10 0.09 0.06 0.09
Age30_34 0.09 0.12 0.12 0.07 0.10
Age35_39 0.12 0.15 0.14 0.11 0.13
Age40_44 0.17 0.14 0.16 0.16 0.16
The question now is how to make it look nicer (remove "mean" and replace 1, 2, 3, 4 with words) and then export it with the putexcel command.