Question

I have a dataset in Stata that looks like this:

entityID    indicator    indicatordescr    indicatorvalue
1           gdp          Gross Domestic    100
1           pop          Population        15
1           area         Area              50
2           gdp          Gross Domestic    200
2           pop          Population        10
2           area         Area              300

There is a one-to-one mapping between the values of indicator and the values of indicatordescr.

I want to reshape it to wide, i.e.:

entityID    gdp     pop     area
1           100     15      50
2           200     10      300

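I would like gdp to have the variable label "Gross Domestic", pop the label "Population", and area the label "Area".

Unfortunately, as far as I understand, it is not possible to assign the values of indicatordescr as value labels of indicator, so reshape cannot turn those value labels into variable labels.

I have looked at this: Bring value labels to variable labels when reshaping wide

and this: http://www.stata.com/support/faqs/data-management/apply-labels-after-reshape/

but I do not understand how to apply them to my case.

Note: the variable labeling after reshape has to be done programmatically, because indicator and indicatordescr have many values.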

Answer 1

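"String labels" is informal here; Stata does not support value labels for string variables. What is needed, however, is for the distinct values of a string variable to become variable labels when reshaping.

Various workarounds exist. Here is one: put the information into the variable names, then take it out again.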

clear 
input entityID  str4 indicator   str14 indicatordescr    indicatorvalue
1           gdp          "Gross Domestic"    100
1           pop          "Population"        15
1           area         "Area"              50
2           gdp          "Gross Domestic"    200
2           pop          "Population"        10
2           area         "Area"              300
end 

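* Pack name and description into one string, e.g. "gdp_Gross_Domestic"
gen what = indicator + "_"  + subinstr(indicatordescr, " ", "_", .)  
keep entityID what indicatorvalue 
reshape wide indicatorvalue , i(entityID) j(what) string 

* Split each new name at its first underscore: the first part is the name, the rest is the label
foreach v of var indicator* {
    local V : subinstr local v "_" " ", all
    local new : word 1 of `V' 
    rename `v' `new'
    local V = substr("`V'", strpos("`V'", " ") + 1, .)
    label var `new' "`V'"
}

* Strip the reshape stub, so that indicatorvaluegdp becomes gdp
renpfix indicatorvalue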

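Edit: If the limit on variable name length bites (Stata variable names cannot exceed 32 characters), try this other workaround: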

clear 
input entityID  str4 indicator   str14 indicatordescr    indicatorvalue
1           gdp          "Gross Domestic"    100
1           pop          "Population"        15
1           area         "Area"              50
2           gdp          "Gross Domestic"    200
2           pop          "Population"        10
2           area         "Area"              300
end 

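* Save the distinct (indicator, indicatordescr) pairs in Mata before the descriptions are dropped
mata : sdata = uniqrows(st_sdata(., "indicator indicatordescr")) 
keep entityID indicator indicatorvalue 
reshape wide indicatorvalue , i(entityID) j(indicator) string 
renpfix indicatorvalue 
* Apply each saved description as the label of the matching variable
mata : for(i = 1; i <= rows(sdata); i++) stata("label var " + sdata[i, 1] + "  " + char(34) + sdata[i,2] + char(34))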

Later edit: Although the second approach above is presented as a workaround, it is a better solution than the first.

Carrying over string labels of string variables after reshape
