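I am trying to use survey data to link individuals to their parents. Specifically, I have data on whether each individual holds a degree, and I want to create an indicator variable equal to 1 if a parent has a degree. I have a family identifier variable and a variable indicating whether the individual is a parent or a child.
Any advice on how to do this would be greatly appreciated. If you would like to provide sample code, I am experienced in both R and Stata.
Edit:
I have something like this: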
Person Family Characteristic Degree
1 1 Child No
2 1 Parent Yes
3 2 Child No
4 2 Parent No
I want something like this:
Person Family Characteristic Degree Parent_Degree
1 1 Child No 1
3 2 Child No 0
Answer 0 (score: 1)
A Stata solution:
clear
input Person Family str9 Characteristic str3 Degree
1 1 Child No
2 1 Parent Yes
3 2 Child No
4 2 Parent No
end
* flag parents who hold a degree (a child's own degree must not count)
gen byte_Degree = (Degree=="Yes" & Characteristic=="Parent")
bys Family: egen Parent_Degree = total(byte_Degree)
* in case both parents have a degree
replace Parent_Degree = (Parent_Degree > 0)
drop byte_Degree Degree
keep if Characteristic == "Child"
list
     +---------------------------------------+
     | Person   Family   Charac~c   Parent~e |
     |---------------------------------------|
  1. |      1        1      Child          1 |
  2. |      3        2      Child          0 |
     +---------------------------------------+
Answer 1 (score: 0)
Here is an R solution using the dplyr library:
library(dplyr)
df %>%
# within each family, let Parent_Degree be 1 if any parent has a degree
group_by(Family) %>%
mutate(Parent_Degree = as.integer(any(Characteristic == "Parent" & Degree == "Yes"))) %>%
ungroup() %>%
# keep only child rows
filter(Characteristic == "Child")
# A tibble: 2 x 5
  Person Family Characteristic Degree Parent_Degree
   <int>  <int>         <fctr> <fctr>         <int>
1      1      1          Child     No             1
2      3      2          Child     No             0
Data:
df <- read.table(header = TRUE,
text = "Person Family Characteristic Degree
1 1 Child No
2 1 Parent Yes
3 2 Child No
4 2 Parent No")
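If you would rather not depend on dplyr, the same idea can be sketched in base R with `ave()`, which broadcasts a per-family summary back to every row. This is only a minimal sketch on the example data from the question; the names `parent_has_degree` and `children` are introduced here for illustration, and it assumes `Characteristic` and `Degree` are coded exactly as "Parent"/"Child" and "Yes"/"No".

```r
# Example data from the question, read as plain character columns
df <- read.table(header = TRUE, stringsAsFactors = FALSE,
                 text = "Person Family Characteristic Degree
1 1 Child No
2 1 Parent Yes
3 2 Child No
4 2 Parent No")

# TRUE only for rows that are parents holding a degree
# (parent_has_degree is a hypothetical helper name)
parent_has_degree <- df$Characteristic == "Parent" & df$Degree == "Yes"

# Take the maximum of that 0/1 flag within each family and
# copy it onto every row belonging to the family
df$Parent_Degree <- ave(as.integer(parent_has_degree), df$Family, FUN = max)

# Keep only the child rows
children <- df[df$Characteristic == "Child", ]
children
```

Because the flag is computed from parent rows only, a child's own degree never switches `Parent_Degree` on, and a family with two degree-holding parents still yields 1 rather than 2.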