Question

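How can I create a new variable in R whose value depends on matching several conditions across rows of a single data frame? I want to create a new variable (couple_smoke) from the dataset below. There is no couple variable in the dataset, so it has to be derived from the existing variables: a couple is a male and a female who share the same cluster, houseno, and partnernum. If anyone has a command to create this couple_smoke variable, I would greatly appreciate it.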

View(afgan)
sex      cluster   houseno   partnernum   smoke   **couple_smoke**
male     1         4         2            yes     yes
female   1         4         2            yes     yes
male     1         4         1            no      no
male     3         10        1            no      no
female   3         10        1            yes     no
female   4         4         2            no      no
female   4         4         1            no      no
male     4         4         3            no      no

Answer 1

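I assume you define couple_smoke as "yes" when a couple lives in the same household and both partners smoke, so the two rows should share the same values for smoke as well as for cluster, houseno, and partnernum. Am I right?

If so, the following should do the trick. First, input the data (as csgroen pointed out, please provide the input code next time):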

afgan <- structure(list(
  sex = structure(c(2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L), 
                  .Label = c("female", "male"), class = "factor"), 
  cluster= c(1, 1, 1, 3, 3, 4, 4, 4), 
  houseno= c(4, 4, 4, 10, 10, 4, 4, 4), 
  partnernum= c(2, 2, 1, 1, 1, 2, 1, 3), 
  smoke = structure(c(1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L), 
    .Label = c("yes", "no"), class = "factor")),
  .Names = c("sex", "cluster", "houseno", "partnernum", "smoke"), 
  row.names = c(NA, 8L), class = "data.frame")

Then:

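library(tidyverse)
library(magrittr)  # provides the %<>% assignment pipe

# Rows sharing cluster, houseno, partnernum and smoke fall into one group;
# a group with more than one row is flagged 1, otherwise 0
afgan %<>% 
  group_by(cluster, houseno, partnernum, smoke) %>% 
  mutate(couple_smoke = ifelse(n() > 1, 1, 0))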

The n() function from the dplyr package counts the number of rows in each group.
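One thing to watch: because smoke is part of the grouping, two partners who both answered "no" would also form a group of two and be flagged. That case does not occur in the sample data, but a stricter variant may be worth considering. The sketch below is only one possible reading of the question. It assumes a couple is exactly one male and one female sharing cluster, houseno and partnernum, and that couple_smoke should be "yes" only when both smoke. The helper column is_couple is a name introduced here for illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt with data.frame() for brevity
afgan <- data.frame(
  sex        = c("male", "female", "male", "male", "female", "female", "female", "male"),
  cluster    = c(1, 1, 1, 3, 3, 4, 4, 4),
  houseno    = c(4, 4, 4, 10, 10, 4, 4, 4),
  partnernum = c(2, 2, 1, 1, 1, 2, 1, 3),
  smoke      = c("yes", "yes", "no", "no", "yes", "no", "no", "no")
)

result <- afgan %>%
  # Group on the couple identifiers only; smoke is deliberately left out
  group_by(cluster, houseno, partnernum) %>%
  mutate(
    # Assumption: a couple is a group of two containing both sexes
    is_couple    = n() == 2 && all(c("male", "female") %in% sex),
    # "yes" only when the group is a couple and every member smokes
    couple_smoke = if_else(is_couple & all(smoke == "yes"), "yes", "no")
  ) %>%
  ungroup() %>%
  select(-is_couple)

print(result)
```

On the sample data this reproduces the couple_smoke column shown in the question: "yes" for the first two rows and "no" for the rest.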

Answer 2

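Consider base R's ave(): pass it a vector of ones whose length equals nrow(df), together with the grouping columns, and sum within each group to get the group sizes.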

df$couple_smoke <- ifelse(ave(rep(1, nrow(df)), df$cluster, df$houseno,
                          df$partnernum, df$smoke, FUN=sum) > 1, 'yes', 'no')
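To make the intermediate step visible, here is a small sketch that applies the same idea to the question's sample data. It assumes the data frame is named afgan, as in the question, rather than df. The names ones and counts are introduced here for illustration.

```r
# Sample data from the question, rebuilt with data.frame() for brevity
afgan <- data.frame(
  sex        = c("male", "female", "male", "male", "female", "female", "female", "male"),
  cluster    = c(1, 1, 1, 3, 3, 4, 4, 4),
  houseno    = c(4, 4, 4, 10, 10, 4, 4, 4),
  partnernum = c(2, 2, 1, 1, 1, 2, 1, 3),
  smoke      = c("yes", "yes", "no", "no", "yes", "no", "no", "no")
)

# One 1 per row; summing within each group yields that group's size,
# and ave() returns the size repeated for every row of the group
ones   <- rep(1, nrow(afgan))
counts <- ave(ones, afgan$cluster, afgan$houseno, afgan$partnernum, afgan$smoke,
              FUN = sum)
print(counts)  # 2 2 1 1 1 1 1 1

# Rows whose group has more than one member are flagged "yes"
afgan$couple_smoke <- ifelse(counts > 1, "yes", "no")
print(afgan)
```

Only the first two rows share all four grouping values, so only they get a count of 2 and a couple_smoke of "yes".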
