I ran a regression in Stata:
reg y I.ind1990#I.year, nocons r
Then I exported the coefficient vector from Stata using
matrix x = e(b)
esttab matrix(x) using "xx.csv", replace plain
and loaded it into Python with pandas using
import pandas as pd
df = pd.read_csv('xx.csv', skiprows=1, index_col=[0]).T.dropna()
df.index.name = 'interaction'
df = df.reset_index()
ind1990 and year are numeric. But my csv contains some odd values (year and ind were pulled out of interaction by hand):
             interaction        y1 ind   year
0  0b.ind1990#2001b.year  0.000000  0b  2001b
1   0b.ind1990#2002.year  0.320578  0b   2002
2   0b.ind1990#2003.year  0.304471  0b   2003
3   0b.ind1990#2004.year  0.271429  0b   2004
4   0b.ind1990#2005.year  0.295347  0b   2005
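I believe 0b is how Stata renders the missing value, i.e. NIU. But I cannot make sense of the other non-numeric values.
Here is what I get for year (both b and o are unexpected suffixes):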
array(['2001b', '2002', '2003', '2004', '2005', '2006', '2007', '2008',
'2009', '2010', '2011', '2012', '2013', '2014', '2015', '2004o',
'2008o', '2012o', '2003o', '2005o', '2006o', '2007o', '2009o',
'2010o', '2011o', '2013o', '2014o', '2015o', '2002o'], dtype=object)
and for ind1990 (where 0b is plausibly NIU, but there are also o suffixes that I cannot make sense of):
array(['0b', '10', '11', '12', '20', '31', '32', '40', '41', '42', '50',
'60', '100', '101', '102', '110', '111', '112', '120', '121', '122',
'122o', '130', '130o', '132', '140', '141', '142', '150', '151',
'152', '152o', '160', '161', '162', '171', '172', '180', '181',
'182', '190', '191', '192', '200', '201', '201o', '210', '211',
'220', '220o', '221', '221o', '222', '222o', '230', '231', '232',
'241', '242', '250', '251', '252', '261', '262', '270', '271',
'272o', '272'], dtype=object)
What do the b and o suffixes at the end of the values in the interaction column mean?
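The manual split of the interaction labels mentioned above can be sketched as follows. This is only an illustration, not the code used in the question: the helper name split_interaction and the regular expression are hypothetical, and it assumes every term has the shape "digits, optional lowercase letters, a dot, a variable name". It separates the numeric level from any trailing letters without interpreting them, so the levels can be converted to numbers while the suffixes are kept for inspection.

```python
import re

# One factor term such as "0b.ind1990" or "2002.year":
# digits (the level), optional lowercase letters (the suffix), a dot, the variable name.
# The pattern is an assumption about the label format, not something Stata guarantees.
TERM = re.compile(r"^(\d+)([a-z]*)\.(\w+)$")

def split_interaction(label):
    """Split 'A.var1#B.var2' into {var: (level, suffix)}; suffix is '' when absent."""
    parts = {}
    for term in label.split("#"):
        match = TERM.match(term)
        if match is None:
            raise ValueError(f"unrecognised term: {term!r}")
        level, suffix, name = match.groups()
        parts[name] = (int(level), suffix)
    return parts

# Labels copied from the output shown in this thread.
for label in ["0b.ind1990#2001b.year", "0b.ind1990#2002.year", "1o.foreign#2o.rep78"]:
    print(label, "->", split_interaction(label))
```

Keeping the suffix as its own field makes it easy to count how many labels carry a b or an o before deciding what to do with them.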
Answer 0 (score: 0)
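This is not an answer, but it will not fit in a comment, and it may clarify the question.
The example here is not reproducible without @FooBar's data. Here is another one that (a) Stata users can reproduce and (b) I gather Python users can import: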
. sysuse auto, clear
(1978 Automobile Data)
. regress mpg i.foreign#i.rep78, nocons r
note: 1.foreign#1b.rep78 identifies no observations in the sample
note: 1.foreign#2.rep78 identifies no observations in the sample
Linear regression                               Number of obs     =         69
                                                F(7, 62)          =     364.28
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9291
                                                Root MSE          =     6.1992
-------------------------------------------------------------------------------
              |               Robust
          mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
foreign#rep78 |
   Domestic#2 |     19.125   1.311239    14.59   0.000     16.50387    21.74613
   Domestic#3 |         19   .8139726    23.34   0.000     17.37289    20.62711
   Domestic#4 |   18.44444   1.520295    12.13   0.000     15.40542    21.48347
   Domestic#5 |         32   1.491914    21.45   0.000     29.01771    34.98229
    Foreign#1 |          0  (empty)
    Foreign#2 |          0  (empty)
    Foreign#3 |   23.33333   1.251522    18.64   0.000     20.83158    25.83509
    Foreign#4 |   24.88889   .8995035    27.67   0.000     23.09081    26.68697
    Foreign#5 |   26.33333   3.105666     8.48   0.000      20.1252    32.54147
-------------------------------------------------------------------------------
. matrix b = e(b)
. esttab matrix(b) using b.csv, plain
(output written to b.csv)
The b.csv file looks like this:
"","b","","","","","","","","",""
"","0b.foreign#1b.rep78","0b.foreign#2.rep78","0b.foreign#3.rep78","0b.foreign#4.rep78","0b.foreign#5.rep78","1o.foreign#1b.rep78","1o.foreign#2o.rep78","1.foreign#3.rep78","1.foreign#4.rep78","1.foreign#5.rep78"
"y1","0","19.125","19","18.44444","32","0","0","23.33333","24.88889","26.33333"
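For Python users, a minimal sketch of reading this file and listing the labels that carry a given suffix might look like the following. It uses only the standard csv module so that it runs without pandas; the B_CSV string is the file content shown above pasted inline, and the helper has_suffix is a hypothetical name introduced for illustration. In this particular file, the two labels that contain an o turn out to be the Foreign#1 and Foreign#2 cells that the regression table marks (empty), and both have a coefficient of 0.

```python
import csv
import io

# The three lines of b.csv exactly as shown above, pasted inline for the example.
B_CSV = '''"","b","","","","","","","","",""
"","0b.foreign#1b.rep78","0b.foreign#2.rep78","0b.foreign#3.rep78","0b.foreign#4.rep78","0b.foreign#5.rep78","1o.foreign#1b.rep78","1o.foreign#2o.rep78","1.foreign#3.rep78","1.foreign#4.rep78","1.foreign#5.rep78"
"y1","0","19.125","19","18.44444","32","0","0","23.33333","24.88889","26.33333"
'''

rows = list(csv.reader(io.StringIO(B_CSV)))
labels = rows[1][1:]                        # second line holds the column labels
values = [float(v) for v in rows[2][1:]]    # third line holds the coefficients
coefs = dict(zip(labels, values))

def has_suffix(label, suffix):
    """True if any term's level (the text before the first dot) ends with `suffix`."""
    return any(term.split(".")[0].endswith(suffix) for term in label.split("#"))

# Labels in which at least one term carries an "o".
omitted = [label for label in labels if has_suffix(label, "o")]
for label in omitted:
    print(label, coefs[label])
```

The same has_suffix check with "b" picks out the labels containing a b-marked level, which can then be compared against the regression table in the same way.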
Stata's notation is documented online, so non-Stata users can look it up too.
I do not use esttab (a user-written Stata program) or Python (that is ignorance, not prejudice), so I cannot comment beyond that.