I have two data frames that look similar to these:
>health
ID Stroke Diab MI Age Sex
1 1 0 0 0 65 M
2 2 0 0 0 66 M
3 3 1 0 0 78 F
4 4 0 0 0 55 M
5 5 0 0 0 67 M
6 6 1 1 1 66 M
7 7 0 0 0 79 F
8 8 0 0 0 54 M
9 9 0 0 0 65 F
10 10 1 1 1 78 F
>Asthma
ID Smoker Smoking_Status
1 12 2 0
2 15 0 1
3 24 1 0
4 2 2 1
5 8 2 0
6 53 1 1
7 10 0 0
8 32 0 0
9 1 0 0
10 5 1 1
This is the code I used to generate the example tables:
health <- data.frame(ID=c(1,2,3,4,5,6,7,8,9,10), Stroke = factor(c(0,0,1,0,0,1,0,0,0,1)),
Diab = factor(c(0,0,0,0,0,1,0,0,0,1)), MI = factor(c(0,0,0,0,0,1,0,0,0,1)),
Age = factor(c(65,66,78,55,67,66,79,54,65,78)),
Sex = factor(c("M","M","F","M","M","M","F","M","F","F")))
Asthma <- data.frame(ID=c(12,15,24,2,8,53,10,32,1,5), Smoker = factor(c(2,0,1,2,2,1,0,0,0,1)),
Smoking_Status = factor(c(0,1,0,1,0,1,0,0,0,1)))
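My question is how to add another column to the health data frame whose value is 1 if the ID also appears in the Asthma data frame and 0 otherwise.
This is my expected result: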
ID Asthma Stroke Diab MI Age Sex
1 1 1 0 0 0 65 M
2 2 1 0 0 0 66 M
3 3 0 1 0 0 78 F
4 4 0 0 0 0 55 M
5 5 1 0 0 0 67 M
6 6 0 1 1 1 66 M
7 7 0 0 0 0 79 F
8 8 1 0 0 0 54 M
9 9 0 0 0 0 65 F
10 10 1 1 1 1 78 F
Answer 0 (score: 0)
One of many possible ways:
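# match() returns the position of each health$ID in Asthma$ID, or 0 when there is no match
health$asthma <- match(x = health$ID, table = Asthma$ID, nomatch = 0)
# Collapse every non-zero position to 1
health$asthma <- replace(x = health$asthma, list = which(health$asthma > 0), values = 1)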
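The same idea can be written more directly with `%in%`, which returns a logical vector that converts cleanly to 0/1. This is a minimal base R sketch, not the only way to do it. It recreates only the columns it needs from the question's data, and the column reordering at the end is my own addition to mirror the layout of the expected result:

```r
# Trimmed-down copies of the question's data (only the columns needed here)
health <- data.frame(ID = 1:10, Stroke = factor(c(0,0,1,0,0,1,0,0,0,1)))
Asthma <- data.frame(ID = c(12,15,24,2,8,53,10,32,1,5))

# TRUE/FALSE membership test converted to 1/0
health$asthma <- as.integer(health$ID %in% Asthma$ID)

# Optional: move the new column so it sits right after ID
health <- health[, c("ID", "asthma", setdiff(names(health), c("ID", "asthma")))]
```

Because `%in%` never returns `NA`, there is no need for a `nomatch` argument or a second replacement step.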
Using data.table:
library(data.table)
health = as.data.table(x = health)
Asthma = as.data.table(x = Asthma)
# Start with a column of zeros, then set the matching rows to 1 by reference
health[, asthma := numeric(nrow(health))]
set(x = health, i = which(health$ID %in% Asthma$ID), j = "asthma", value = 1)
#> health
# ID Stroke Diab MI Age Sex asthma
# 1: 1 0 0 0 65 M 1
# 2: 2 0 0 0 66 M 1
# 3: 3 1 0 0 78 F 0
# 4: 4 0 0 0 55 M 0
# 5: 5 0 0 0 67 M 1
# 6: 6 1 1 1 66 M 0
# 7: 7 0 0 0 79 F 0
# 8: 8 0 0 0 54 M 1
# 9: 9 0 0 0 65 F 0
#10: 10 1 1 1 78 F 1
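As a quick sanity check on which rows should be flagged, the overlapping IDs can be listed directly. This is a small base R sketch that assumes only the two ID vectors from the question; the variable names are illustrative:

```r
# ID vectors copied from the question's example data
health_ids <- 1:10
asthma_ids <- c(12, 15, 24, 2, 8, 53, 10, 32, 1, 5)

# IDs present in both data frames, in ascending order
matched <- sort(intersect(health_ids, asthma_ids))
matched
# [1]  1  2  5  8 10
```

These five IDs are exactly the rows where the indicator column should be 1.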
Answer 1 (score: 0)
You can do this in one line with the data.table package:
> data.table::setDT(health)[,ind:=ifelse(ID %in% Asthma$ID,1,0)]
> health
ID Stroke Diab MI Age Sex ind
1: 1 0 0 0 65 M 1
2: 2 0 0 0 66 M 1
3: 3 1 0 0 78 F 0
4: 4 0 0 0 55 M 0
5: 5 0 0 0 67 M 1
6: 6 1 1 1 66 M 0
7: 7 0 0 0 79 F 0
8: 8 0 0 0 54 M 1
9: 9 0 0 0 65 F 0
10: 10 1 1 1 78 F 1
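A variant of the same one-liner drops `ifelse()` and converts the logical result directly. It also moves the new column next to `ID`. This is only a sketch on a trimmed-down copy of the data, and it assumes a data.table version in which `setcolorder()` accepts a partial column list and moves the named columns to the front:

```r
library(data.table)

# Trimmed-down copies of the question's data (only the columns needed here)
health <- data.table(ID = 1:10, Age = c(65,66,78,55,67,66,79,54,65,78))
Asthma <- data.table(ID = c(12,15,24,2,8,53,10,32,1,5))

# Logical membership test converted to 0/1 and added by reference
health[, ind := as.integer(ID %in% Asthma$ID)]

# Optional: put the indicator right after ID (remaining columns keep their order)
setcolorder(health, c("ID", "ind"))
```

Using `as.integer()` keeps the column as integer 0/1 rather than the double values that `ifelse(..., 1, 0)` produces, which only matters if the type is checked later.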