Matching nominations with their ID codes

Date: 2016-05-12 21:18:43

Tags: stata social-networking network-analysis

I am running a network analysis and I have a dataset that looks like this:

**ID-code | ego   | alter1 | alter2 | alter3 | Office**
100       | JOHN  | ROCKY  | JOE    | MOLLY  | 1
101       | ROCKY | JOE    | MOLLY  | JOHN   | 1
102       | JOE   | MOLLY  | JOHN   | .      | 1
103       | MOLLY | ROCKY  | .      | .      | 1

As you can see, each ego was asked to name up to three alters from the same office.

I would like to match each name with its ID code to obtain new variables/columns like these:

**ID-code ego | ID_alter1 | ID_alter2 | ID_alter3**
100JOHN       | 101ROCKY  | 102JOE    | 103MOLLY
101ROCKY      | 102JOE    | 103MOLLY  | 100JOHN
102JOE        | 103MOLLY  | 100JOHN   | .
103MOLLY      | 101ROCKY  | .         | .

I already know how to create the variable ID-code ego:

*egen ID-code ego= concat (ID-code ego)*

But I don't know how to match the alters in the other observations with their ID codes.

Any suggestions are welcome.

Thanks, AMEDEO

2 answers:

Answer 0 (score: 1)

Kevin Crow wrote a vlookup clone that makes this easy:

clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3
100 "JOHN" "ROCKY" "JOE" "MOLLY"
101 "ROCKY" "JOE" "MOLLY" "JOHN"
102 "JOE" "MOLLY" "JOHN" ""
103 "MOLLY" "ROCKY" "" ""
end
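* vlookup is user-written; capture suppresses the error if the install fails
capture net install vlookup, from(http://www.stata.com/users/kcrow)
* combine each ego's ID code with its name
gen id_code_ego = string(id_code) + ego
forvalues i=1/3 {
    * look up the ID code of the ego whose name matches this alter
    vlookup alter`i', gen(code) key(ego) value(id_code)
    gen id_alter`i' = string(code) + alter`i'
    drop alter`i' code
}
drop id_code ego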

Addendum:

clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3 int officer
100 "JOHN" "ROCKY" "JOE" "MOLLY" 1
101 "ROCKY" "JOE" "MOLLY" "JOHN" 1
102 "JOE" "MOLLY" "JOHN" "" 1
103 "MOLLY" "ROCKY" "" "" 1
103 "JOHN" "ROCKY" "JOE" "MOLLY" 2
102 "ROCKY" "JOE" "MOLLY" "JOHN" 2
101 "JOE" "MOLLY" "JOHN" "" 2
100 "MOLLY" "ROCKY" "" "" 2
end
capture net install vlookup, from(http://www.stata.com/users/kcrow)

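* names and ID codes repeat across offices, so build keys that include the office
gen id_code_ego_officer = string(id_code) + ego + string(officer)
gen ego_officer = ego + string(officer)

forvalues i=1/3 {
    * append the office number so the lookup stays within the same office
    replace alter`i' = alter`i' + string(officer)
    vlookup alter`i', gen(code) key(ego_officer) value(id_code)
    gen id_alter`i' = string(code) + alter`i'
    * strip the trailing office digit again
    replace id_alter`i' = regexr(id_alter`i',"[0-9]?$","")
    drop alter`i' code
}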

drop id_code_ego_officer ego_officer

Answer 1 (score: 1)

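To match values from other observations, the typical approach in Stata is to use merge. In the first step, you create a master list of the distinct egos in each office. You then go back to the original data and merge each alter with that list, matching on office and name. Performing the merge requires renaming some variables: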

clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3 int office
100 "JOHN" "ROCKY" "JOE" "MOLLY" 1
101 "ROCKY" "JOE" "MOLLY" "JOHN" 1
102 "JOE" "MOLLY" "JOHN" "" 1
103 "MOLLY" "ROCKY" "" "" 1
103 "JOHN" "ROCKY" "JOE" "MOLLY" 2
102 "ROCKY" "JOE" "MOLLY" "JOHN" 2
101 "JOE" "MOLLY" "JOHN" "" 2
100 "MOLLY" "ROCKY" "" "" 2
end

* make a master list of unique id/name per office
preserve
keep office id_code ego
isid office id_code ego, sort
rename (id_code ego) (id0 ego0)
save "match_egos.dta", replace
restore

* combine the id/ego for each observation
gen ID_ego = string(id_code) + ego

* loop over each alter and merge with the master list
forvalues i = 1/3 {
    clonevar ego0 = alter`i'
    merge m:1 office ego0 using "match_egos.dta", keep(master match) nogen
    gen ID_alter`i' = string(id0) + alter`i'
    drop ego0 id0
}

isid office id_code ego, sort
* leftalign is from SSC; to install, type in Command window: ssc install leftalign
leftalign
list ID_*, sepby(office)
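
The same merge idea can also be written without a loop by first reshaping the data to one row per nomination. The following is only a sketch under assumptions, not part of either answer: the names `slot`, `alter_id`, and the temporary file `egos` are made up for illustration, and the code assumes that `office` and `id_code` together identify each ego and that names are unique within an office. Run it as a whole do-file, because the temporary file is referenced through a local macro.

```stata
clear
input int id_code str5 ego str5 alter1 str5 alter2 str5 alter3 int office
100 "JOHN" "ROCKY" "JOE" "MOLLY" 1
101 "ROCKY" "JOE" "MOLLY" "JOHN" 1
102 "JOE" "MOLLY" "JOHN" "" 1
103 "MOLLY" "ROCKY" "" "" 1
103 "JOHN" "ROCKY" "JOE" "MOLLY" 2
102 "ROCKY" "JOE" "MOLLY" "JOHN" 2
101 "JOE" "MOLLY" "JOHN" "" 2
100 "MOLLY" "ROCKY" "" "" 2
end

* lookup table: one row per office and name (alter_id is a hypothetical name)
preserve
keep office id_code ego
rename (id_code ego) (alter_id alter)
tempfile egos
save `egos'
restore

* one row per ego-alter nomination; slot records which alter it was (1, 2 or 3)
reshape long alter, i(office id_code) j(slot)
drop if alter == ""

* attach each alter's ID code, matching on office and name
merge m:1 office alter using `egos', keep(master match) nogenerate

sort office id_code slot
list office id_code ego slot alter alter_id, sepby(office id_code)
```

The result is an edge list (ego, alter) rather than the wide layout asked for in the question. If the wide layout is needed, `reshape wide alter alter_id, i(office id_code) j(slot)` should bring it back, at the cost of another reshape.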