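I am trying to find the correlation between the GDPs of different countries using Stata. I am using the data provided by Penn World Tables 8.1 (http://www.rug.nl/research/ggdc/data/pwt/v81/pwt81.zip). It is a huge table with a lot of macro statistics, but essentially I am interested in the variables country, year and rgdpna (GDP).
I have been trying to create a new variable for each country I am interested in and then correlate those variables with pwcorr. However, this approach generates a lot of missing values and produces no correlations. My code is: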
/*We generate variables for countries*/
/*We use Sweden as reference point and find 5 near-by countries.
The chosen countries are Sweden, Norway, Finland, Germany, Denmark*/
gen swe = rgdpna if country == "Sweden" & year >= 1997
gen nor = rgdpna if country == "Norway" & year >= 1997
gen fin = rgdpna if country == "Finland" & year >= 1997
gen ger = rgdpna if country == "Germany" & year >= 1997
gen den = rgdpna if country == "Denmark" & year >= 1997
/*Then we choose 5 far-away countries. The chosen countries are
Canada, China, Japan, Russia, US*/
gen can = rgdpna if country == "Canada" & year >= 1997
gen usa = rgdpna if country == "United States" & year >= 1997
gen rus = rgdpna if country == "Russian Federation" & year >= 1997
gen chn = rgdpna if country == "China, People's Republic of" & year >= 1997
gen jap = rgdpna if country == "Japan" & year >= 1997
/*pwcorr the variables*/
pwcorr swe nor fin ger den can usa rus chn
This gives the following result:
| swe nor fin ger den can usa
-------------+---------------------------------------------------------------
swe | 1.0000
nor | . 1.0000
fin | . . 1.0000
ger | . . . 1.0000
den | . . . . 1.0000
can | . . . . . 1.0000
usa | . . . . . . 1.0000
rus | . . . . . . .
chn | . . . . . . .
| rus chn
-------------+------------------
rus | 1.0000
chn | . 1.0000
Does anyone know how to fix this?
Answer 0 (score: 2)
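You have a panel data structure, so the values for different countries sit in different observations. No observation holds nonmissing values for two countries at once, so you should not be surprised that every correlation is missing except that of a country with itself. You need to reshape first, something like this: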
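* Keep the period and the ten countries of interest
keep if year >= 1997
local c1 inlist(country, "Sweden", "Norway", "Finland", "Germany", "Denmark")
local c2 inlist(country, "Canada", "United States", "Russian Federation", "China, People's Republic of", "Japan")
keep if `c1' | `c2'
* Country names contain spaces and commas, so they cannot be used as
* variable-name suffixes; encode them to a numeric id instead
* (label list cid shows which number maps to which country)
encode country, generate(cid)
* Keep only what reshape needs; other variables vary within year
keep year cid rgdpna
reshape wide rgdpna, i(year) j(cid)
pwcorr rgdpna*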
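If you want to see for yourself why the original attempt returned only missing correlations, a quick check along these lines may help. It is a minimal sketch that assumes you run it on the original long data, right after the gen commands from the question and before any reshape; nvars is just a name I made up for the helper variable.

```stata
* Run on the original long data, after the gen commands from the question.
* Each observation is a single country-year, so swe and nor can never
* both be nonmissing in the same row: this count should report 0.
count if !missing(swe) & !missing(nor)

* nvars (hypothetical name) counts how many of the country variables
* are nonmissing in each row; it should never exceed 1.
egen nvars = rownonmiss(swe nor fin ger den can usa rus chn jap)
tabulate nvars
```

pwcorr computes each correlation only from observations where both variables are nonmissing, so with no such observations there is nothing to compute from until the data are in wide form with one row per year.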