Matching variables in one dataset to bins defined in another dataset (if-else binned variables) in R

Date: 2015-05-22 17:09:39

Tags: r loops variables if-statement matching

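I am trying to bin some (75) continuous variables according to predefined bins stored in another data frame. Data frame G holds all the bins I want, and data frame Test holds the continuous variables I need to discretize. For example, variable X3975 has the bin cutoffs .0625 and .1, so I would need to write ifelse statements like this: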

ifelse(X3975 >= 0 & X3975 <= .0625, "0-.0625",
ifelse(X3975 >= .0625 & X3975 <= .1, ".0625-.1",
ifelse(X3975 >= .1, ">.1", NA)))

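and repeat that for every variable in the G dataset, matching each one to the variable of the same name in the Test dataset. Is there an efficient way to do this?

The G data frame looks like this: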

Bins    Variable
.0625   X3975
.1      X3975
.01     X3976
.1      X3976
......

There are 75 different variables in total, each with a different number of bins.

The Test data frame looks like this:

    X3001 X3100 X3102 .... X3999

1 Answer:

Answer 0 (score: 0)

You can try cut: split the cutoffs in G by variable name, then cut each column of df1 at its own cutoffs.

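# One numeric vector of cutoffs per variable, named by variable
lst <- split(G$Bins, G$Variable)
df2 <- df1
# Cut each column at its own cutoffs; df2[] keeps the data.frame structure
df2[] <- Map(function(x, y) cut(x, breaks = c(-Inf, y, Inf)), df1, lst[names(df1)])

df2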
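To see what the cut() call above does for a single variable, here is a minimal sketch using the two cutoffs the question gives for X3975. The sample values in x are made up for illustration. By default cut() builds right-closed intervals, so a value equal to a cutoff falls into the lower bin, which is also what the first matching branch of the nested ifelse would do.

```r
# Hypothetical sample values; only the cutoffs .0625 and .1 come from the question
x <- c(0, 0.05, 0.0625, 0.08, 0.1, 0.5, NA)

# -Inf and Inf pad the cutoffs so every non-missing value lands in some bin
binned <- cut(x, breaks = c(-Inf, 0.0625, 0.1, Inf))

levels(binned)                   # one level per interval, three in total
table(binned, useNA = "ifany")   # counts per bin; NA stays NA
```

One difference from the ifelse version: because the lowest break is -Inf, negative values would land in the first bin instead of becoming NA.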

Data

df1 <- structure(list(X3001 = c(14, 14, NA, 10, 3, 5), X3100 = c(23, 
7, NA, 24, 7, 6), X3102 = c(1, 1, NA, 3, 0, 1), X3104 = c(0, 
0, NA, 2, 0, 0), X3109 = c(1, 1, NA, 7, 1, 1), X3111 = c(197, 
71, NA, 90, 177, 88), X3113 = c(37, 48, NA, 86, NA, 52), X3116 = c(197, 
71, NA, 76, 177, 88), X117 = c(197, NA, NA, NA, NA, NA)),
.Names = c("X3001", 
"X3100", "X3102", "X3104", "X3109", "X3111", "X3113", "X3116", 
"X117"), row.names = c(NA, -6L), class = "data.frame")

G <- structure(list(Bins = c(0, 7, 12, 0, 12, 22, 0, 1, 3, 0, 2, 0, 
6, 40, 150, 200, 10, 40, 90, 60, 180, 80, 180), Variable = c("X3001", 
"X3001", "X3001", "X3100", "X3100", "X3100", "X3102", "X3102", 
"X3102", "X3104", "X3104", "X3109", "X3109", "X3111", "X3111", 
"X3111", "X3113", "X3113", "X3113", "X3116", "X3116", "X117", 
"X117")), .Names = c("Bins", "Variable"), row.names = c(NA, -23L
 ), class = "data.frame")
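If you want labels closer to the ones in the question ("0-.0625", ">.1") and want to leave alone any column that has no cutoffs in G, a sketch along the following lines may help. The helpers make_labels and bin_column, the column X3999 with its values, and the small G and Test data frames are all assumptions made for illustration; they are not part of the original answer or data.

```r
# Hypothetical helper: build one label per interval from a vector of cutoffs
make_labels <- function(cutoffs) {
  n <- length(cutoffs)
  c(paste0("<=", cutoffs[1]),
    if (n > 1) paste0(cutoffs[-n], "-", cutoffs[-1]),
    paste0(">", cutoffs[n]))
}

# Hypothetical helper: cut one column at its cutoffs with readable labels
bin_column <- function(x, cutoffs) {
  cutoffs <- sort(unique(cutoffs))   # labels must line up with sorted breaks
  cut(x, breaks = c(-Inf, cutoffs, Inf), labels = make_labels(cutoffs))
}

# Small made-up stand-ins for the G and Test data frames from the question
G <- data.frame(Bins = c(0.0625, 0.1, 0.01, 0.1),
                Variable = c("X3975", "X3975", "X3976", "X3976"),
                stringsAsFactors = FALSE)
Test <- data.frame(X3975 = c(0.05, 0.08, 0.5),
                   X3976 = c(0.005, 0.05, NA),
                   X3999 = c(1, 2, 3))

lst <- split(G$Bins, G$Variable)
shared <- intersect(names(Test), names(lst))   # only columns that have cutoffs

binned <- Test
binned[shared] <- Map(bin_column, Test[shared], lst[shared])
binned
```

Restricting the Map() call to the shared column names means columns without an entry in G (X3999 here) keep their original numeric values instead of being collapsed into a single (-Inf, Inf] bin.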