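I have two different data frames (caracteristica_receita and coop_receita_anos2d). I need to compare two of their columns (CNPJ and ANO). Where both columns match, I need to put a 1 in a new column (caracteristica_receita$benford).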
I have been using
caracteristica_receita$benford[which(caracteristica_receita$CNPJ %>%
is.element(coop_receita_anos2d$CNPJ))] <- 1
but I don't know how to make it work with two columns.
caracteristica_receita <- structure(list(CNPJ = c(1234, 5678, 91012, 12346, 96385, 87952,
7789, 2535, 4459, 5457), NOME_INSTITUICAO = c("XXXX",
"AAAA", "BBBB", "CCCC", "DDDDD",
"RRRR", "FFFFF",
"GGGGG", "HHHHHH",
"IIIIIII"), ano_fundacao = c(1993,
1993, 1994, 1994, 1994, 1994, 1994, 1994, 1994, 1994), ANO = c(2014,
2015, 2014, 2015, 2016, 2014, 2014, 2015, 2016, 2017), benford = c(0, 0,
0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("CNPJ", "NOME_INSTITUICAO",
"ano_fundacao", "ANO", "benford"), row.names = c(NA, 10L), class = "data.frame")
and
coop_receita_anos2d <- structure(list(CNPJ = c(1234, 5678, 916862, 12346, 96385, 87952,
7789, 2535, 4459, 46868), ANO = c(2014, 2014, 0, 0, 0, 2014,
0, 0, 0, 0)), .Names = c("CNPJ",
"ANO"), row.names = c(1L, 3L,
7L, 11L, 15L, 19L, 23L, 27L, 31L, 35L), class = "data.frame")
So, the result I want is
structure(list(CNPJ = c(1234, 5678, 91012, 12346, 96385, 87952,
7789, 2535, 4459, 5457), NOME_INSTITUICAO = c("XXXX",
"AAAA", "BBBB", "CCCC", "DDDDD",
"RRRR", "FFFFF",
"GGGGG", "HHHHHH",
"IIIIIII"), ano_fundacao = c(1993,
1993, 1994, 1994, 1994, 1994, 1994, 1994, 1994, 1994), ANO = c(2014,
2015, 2014, 2015, 2016, 2014, 2014, 2015, 2016, 2017), benford = c(1, 0,
0, 0, 0, 1, 0, 0, 0, 0)), .Names = c("CNPJ", "NOME_INSTITUICAO",
"ano_fundacao", "ANO", "benford"), row.names = c(NA, 10L), class = "data.frame")
Answer 0 (score: 0)
You can paste the two columns together and use match. Convert the result to a logical, then to an integer, like this:
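# Paste CNPJ and ANO into one key per row, then look each key up in the other data frame
as.integer(!is.na(match(do.call(paste, caracteristica_receita[c('CNPJ', 'ANO')]),
                        do.call(paste, coop_receita_anos2d[c('CNPJ', 'ANO')]))))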
#[1] 1 0 0 0 0 1 0 0 0 0
Or assign it back to your data frame:
caracteristica_receita$benford <- as.integer(!is.na(....))
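For reference, here is a minimal self-contained sketch of the same idea, written out in full. It assumes reduced copies of the question's data frames that hold only the two key columns, and it uses %in% in place of !is.na(match(...)), which is equivalent. The "_" separator and the names keys, key_left and key_right are my own choices for illustration, not part of the question.

```r
# Reduced copies of the question's data frames (key columns only).
caracteristica_receita <- data.frame(
  CNPJ = c(1234, 5678, 91012, 12346, 96385, 87952, 7789, 2535, 4459, 5457),
  ANO  = c(2014, 2015, 2014, 2015, 2016, 2014, 2014, 2015, 2016, 2017)
)
coop_receita_anos2d <- data.frame(
  CNPJ = c(1234, 5678, 916862, 12346, 96385, 87952, 7789, 2535, 4459, 46868),
  ANO  = c(2014, 2014, 0, 0, 0, 2014, 0, 0, 0, 0)
)

# Columns that must both match (illustrative helper name).
keys <- c("CNPJ", "ANO")

# One composite key per row; the explicit separator keeps the two fields apart.
key_left  <- do.call(paste, c(caracteristica_receita[keys], sep = "_"))
key_right <- do.call(paste, c(coop_receita_anos2d[keys], sep = "_"))

# TRUE where the (CNPJ, ANO) pair also exists in the other data frame, stored as 0/1.
caracteristica_receita$benford <- as.integer(key_left %in% key_right)
caracteristica_receita$benford
```

Selecting the key columns by name keeps this working even if the second data frame later gains extra columns.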
Answer 1 (score: 0)
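A simple base R solution (assuming df and df2 have the same number of rows, in the same order, because the comparison is made row by row):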
df <- caracteristica_receita
df2 <- coop_receita_anos2d
ind <- df$ANO == df2$ANO & df$CNPJ == df2$CNPJ
df$benford <- ifelse(ind, 1, 0)
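If the two data frames are not aligned row by row, the comparison above no longer applies. A possible alternative, sketched here under assumptions, is a left join with merge() on both key columns. The toy data, the helper column row_id and the names df2_flag and out are hypothetical and only serve the illustration.

```r
# Toy data: different lengths and a different row order (hypothetical values).
df  <- data.frame(CNPJ = c(1234, 5678, 87952), ANO = c(2014, 2015, 2014))
df2 <- data.frame(CNPJ = c(87952, 1234),       ANO = c(2014, 2014))

# Lookup table: one row per distinct (CNPJ, ANO) pair, flagged with 1.
df2_flag <- unique(df2[c("CNPJ", "ANO")])
df2_flag$benford <- 1L

# merge() may reorder rows, so remember the original order first.
df$row_id <- seq_len(nrow(df))

# Left join: keep every row of df; unmatched rows get NA in benford.
out <- merge(df, df2_flag, by = c("CNPJ", "ANO"), all.x = TRUE)
out$benford[is.na(out$benford)] <- 0L

# Restore the original row order and drop the helper column.
out <- out[order(out$row_id), ]
out$row_id <- NULL
out
```

The unique() call guards against duplicated key pairs in df2, which would otherwise duplicate rows of df in the join.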
Answer 2 (score: 0)
Thanks, guys! It worked!
Also, a friend of mine sent me this answer:
library(stringr)  # provides str_c() and re-exports %>%
caracteristica_receita$benford[which(str_c(caracteristica_receita$CNPJ, caracteristica_receita$ANO) %>%
is.element(str_c(coop_receita_anos2d$CNPJ, coop_receita_anos2d$ANO)))] <- 1
Thank you very much!
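One caveat about joining the two columns with no separator, as the str_c version does: two different (CNPJ, ANO) pairs can in principle collapse into the same string. The sketch below shows this with deliberately artificial values, using base R paste0() as a stand-in for str_c() (an assumption made only to keep the example free of package dependencies). The names left, right, no_sep and with_sep are hypothetical.

```r
# Artificial values chosen so that the concatenated digits coincide.
left  <- data.frame(CNPJ = 123,  ANO = 42014)
right <- data.frame(CNPJ = 1234, ANO = 2014)

# No separator: both rows become the same string, so a false match is reported.
no_sep <- paste0(left$CNPJ, left$ANO) %in% paste0(right$CNPJ, right$ANO)

# With a separator the two fields stay distinct and no match is reported.
with_sep <- paste(left$CNPJ, left$ANO, sep = "_") %in%
  paste(right$CNPJ, right$ANO, sep = "_")

no_sep
with_sep
```

With real four-digit years this is unlikely to matter, but adding a separator costs nothing.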