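I have a requirement where I need to look up values in one data frame, check them against another data frame, and create a new categorical column. It seems as simple as a VLOOKUP, but I have not been able to do it in R.
My data is large, with thousands of rows and many columns. Below I have tried to create sample data in a similar format.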
# Generate sample data
library("plyr")
library("data.table")
library("rpart")
set.seed(1200)
id <- 1:1000
ibd <- sample(1:15, 1000, replace = TRUE)
bills <- sample(1:20, 1000, replace = TRUE)
nos <- sample(1:80, 1000, replace = TRUE)
stru <- sample(c("A", "B", "C", "D"), 1000, replace = TRUE)
v1 <- sample(1:80, 1000, replace = TRUE)
v2 <- sample(1:80, 1000, replace = TRUE)
v3 <- sample(1:80, 1000, replace = TRUE)
v4 <- sample(1:80, 1000, replace = TRUE)
v5 <- sample(1:80, 1000, replace = TRUE)
v6 <- sample(1:80, 1000, replace = TRUE)
v7 <- sample(1:80, 1000, replace = TRUE)
v8 <- sample(1:80, 1000, replace = TRUE)
v9 <- sample(1:80, 1000, replace = TRUE)
v10 <- sample(1:80, 1000, replace = TRUE)
a1 <- sample(1:80, 1000, replace = TRUE)
b1 <- sample(1:80, 1000, replace = TRUE)
type <- sample(1:15, 1000, replace = TRUE)
value <- sample(100:1000, 1000, replace = TRUE)
df1 <- data.frame(id,ibd,bills,nos,stru,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,a1,b1,type,value)
num_var <- c("bills","nos","v1","v2","v3")
v0 <- num_var
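# Lookup table: 4 cutpoint rows for each of the 15 ibd levels
ibda <- rep(1:15, each = 4)
billsa <- sample(5:15, 60, replace = TRUE)
nosa <- sample(15:60, 60, replace = TRUE)
v1a <- sample(10:70, 60, replace = TRUE)
v2a <- sample(20:70, 60, replace = TRUE)
v3a <- sample(20:70, 60, replace = TRUE)
df2 <- data.frame(ibda, billsa, nosa, v1a, v2a, v3a)
# Sorted bills cutpoints for ibd == 1 (index via the df2 column, not the global vector)
bills_ibd1 <- sort(df2[df2$ibda == 1, "billsa"])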
So, as you can see, bills_ibd1 contains 5, 10, 13, 15. I want to check the bills values in df1 where ibd == 1 against these values and create a categorical variable "bills_cat" in df1, coded as follows:
if (ibd == 1 & bills <= 5) bills_cat = 1
if (ibd == 1 & bills > 5 & bills <= 10) bills_cat = 2
if (ibd == 1 & bills > 10 & bills <= 13) bills_cat = 3
if (ibd == 1 & bills > 13 & bills <= 15) bills_cat = 4
if (ibd == 1 & bills > 15) bills_cat = 5
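Note: bills_ibd1 is generated from df2, and I will have a variable like this for every ibd level and every column variable.
But this way I would have to write a lot of if statements and keep track of bills_cat being overwritten each time.
Is there a simpler, better way to achieve this? I need to check the variables in df1 at the ibd level, based on the values in df2. Please advise.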
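For reference, here is a minimal sketch of the kind of thing I am hoping for, using base R's findInterval() on the sorted cutpoints of each ibd level instead of hand-written if statements. The helper name add_cat is made up for illustration. It assumes the naming convention of my sample data (lookup columns are the df1 names plus an "a" suffix), and the tiny df1 and df2 below are toy stand-ins, not my real data.

```r
# Toy stand-ins for df1 and df2 (hypothetical values, same column naming as above)
df1 <- data.frame(
  id    = 1:8,
  ibd   = c(1, 1, 1, 1, 1, 2, 2, 2),
  bills = c(3, 5, 10, 14, 20, 4, 9, 30),
  nos   = c(10, 20, 35, 50, 60, 15, 26, 50)
)
df2 <- data.frame(
  ibda   = c(1, 1, 1, 1, 2, 2, 2, 2),
  billsa = c(13, 5, 15, 10, 8, 6, 12, 20),
  nosa   = c(40, 20, 50, 30, 35, 15, 45, 25)
)

# Hypothetical helper: adds "<var>_cat" to data, using the cutpoints found in
# lookup[["<var>a"]] for the matching level of lookup[["<group>a"]].
add_cat <- function(data, lookup, var, group = "ibd") {
  # One sorted cutpoint vector per group level, named by level
  cuts <- lapply(split(lookup[[paste0(var, "a")]],
                       lookup[[paste0(group, "a")]]), sort)
  out <- rep(NA_integer_, nrow(data))
  for (g in names(cuts)) {
    rows <- which(as.character(data[[group]]) == g)
    # left.open = TRUE gives intervals (lo, hi], i.e. "> lo & <= hi";
    # adding 1 makes values at or below the first cutpoint category 1
    out[rows] <- findInterval(data[[var]][rows], cuts[[g]],
                              left.open = TRUE) + 1L
  }
  data[[paste0(var, "_cat")]] <- out
  data
}

# Apply to every variable of interest (num_var in the sample data above)
for (v in c("bills", "nos")) df1 <- add_cat(df1, df2, v)

df1$bills_cat
# 1 1 2 4 5 1 3 5
```

Rows whose ibd level has no cutpoints in df2 stay NA. If a level has duplicate cutpoints (possible in my sample data, since they are drawn with replacement), some category numbers are simply skipped; wrapping the cutpoints in unique() would be one way to avoid that, if that is the desired behaviour.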