I have the following two data tables:
DT1 <- data.table(A = c(100,50,10), B = c("Good","Ok","Bad"))
DT1
A B
1: 100 Good
2: 50 Ok
3: 10 Bad
and
DT2 <- data.table(A = c(99,34,5,"",24,86))
DT2
A
1: 99
2: 34
3: 5
4:
5: 24
6: 86
When joining DT1 and DT2, this is what I would like to get back:
DT2
A B
1: 99 Good
2: 34 Ok
3: 5 Bad
4: NA
5: 24 Ok
6: 86 Good
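As far as I can tell, the "roll" option in data.table only does "nearest" matching, so it does not work in my case. Is there a way to do this kind of lookup with data.table?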
Answer 0: (score: 2)
A rolling join does work for me if it rolls backward (NOCB = next observation carried backward):
library(data.table)
DT1 <- data.table(A = c(100, 50, 10), B = c("Good", "Ok", "Bad"))
DT2 <- data.table(A = c(99, 34, 5, "", 24, 86))
DT2[, A := as.numeric(A)]
DT1[DT2, on = "A", roll = -Inf]
    A    B
1: 99 Good
2: 34   Ok
3:  5  Bad
4: NA <NA>
5: 24   Ok
6: 86 Good
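Note that this only works if both A columns are numeric (or integer). By including "", the OP turned DT2$A into a character column, which is why it is converted with as.numeric() before the join.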
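As a rough sketch of how the rolling join above could be used to add B to DT2 directly, the following stores the looked-up values by reference. The x.B prefix selects the B column of DT1 inside the join. The variable names query and B_nearest are only illustrative, and the roll = "nearest" call is included purely for comparison with the backward roll.

```r
library(data.table)

DT1 <- data.table(A = c(100, 50, 10), B = c("Good", "Ok", "Bad"))
DT2 <- data.table(A = c(99, 34, 5, "", 24, 86))

# "" cannot be parsed as a number, so it becomes NA after conversion.
DT2[, A := as.numeric(A)]

# For each row of DT2, take B from the next larger-or-equal A in DT1
# (roll = -Inf) and store it in DT2 by reference.
# The join returns one value per row of DT2, in DT2's row order.
DT2[, B := DT1[DT2, on = "A", roll = -Inf, x.B]]

# For comparison only: roll = "nearest" on the non-missing values.
# 24 is closer to 10 than to 50, so it maps to "Bad" here rather than "Ok".
query <- DT2[!is.na(A), .(A)]   # illustrative helper table
B_nearest <- DT1[query, on = "A", roll = "nearest", x.B]
```

The only row where the two approaches disagree on this data is A = 24, which is exactly the case where the question asks for "Ok".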
Answer 1: (score: 1)
Here is a base R approach:
df1 <- as.data.frame(DT1)
df2 <- as.data.frame(DT2)
df2$B <- apply(df2, 1, function(x) {
if(x != "") df1$B[which.min(abs(as.numeric(x) - df1$A))] else NA
})
df2
# A B
# 1 99 Good
# 2 34 Ok
# 3 0 Bad
# 4 <NA>
# 5 24 Bad
# 6 86 Good
Or with data.tables:
DT2[, B := apply(DT2, 1, function(x)
if(x != "") DT1$B[which.min(abs(as.numeric(x) - DT1$A))] else NA)]
DT2
# A B
#1: 99 Good
#2: 34 Ok
#3: 0 Bad
#4: NA
#5: 24 Bad
#6: 86 Good
We match each value on the smallest absolute difference between the DT1$A and DT2$A values.
Data used in this answer (note that A contains 0 where the question has 10 and 5):
DT1 <- data.table(A = c(100,50,0), B = c("Good","Ok","Bad"))
DT2 <- data.table(A = c(99,34,0,"",24,86))
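The nearest-value matching described in this answer can also be written as a small standalone function in base R. This is only a sketch: nearest_label is a hypothetical helper name, not part of any package, and the vectors below simply restate this answer's data.

```r
# Lookup keys and labels, and the query values, from this answer's data.
lookup_A <- c(100, 50, 0)
lookup_B <- c("Good", "Ok", "Bad")
query    <- c("99", "34", "0", "", "24", "86")

# nearest_label (hypothetical helper): for each query value, return the
# label whose key has the smallest absolute difference from it.
nearest_label <- function(x, keys, labels) {
  x <- as.numeric(x)  # "" is converted to NA
  vapply(x, function(v) {
    if (is.na(v)) return(NA_character_)  # missing input gives a missing label
    labels[which.min(abs(v - keys))]     # first key wins on ties
  }, character(1))
}

B <- nearest_label(query, lookup_A, lookup_B)
```

With this data, 24 is matched to "Bad" because it is closer to 0 than to 50. That differs from the result requested in the question, where 24 should map to "Ok".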