I am learning how data.table works, and I am trying to use grep() on two columns (id1 and id2) to remove the rows where the match doesn't return TRUE.
I know I have to use the function lapply(), but it always returns the following warning:
argument 'pattern' has length > 1 and only the first element will be used
This is what I tried (I know it is wrong):
DT[, lapply(.SD, grepl(id1, id2)), by= id]
The data I am working with:
structure(list(id = c(52L, 52L, 52L, 52L, 54L, 54L, 84L, 84L,
87L, 87L, 129L, 129L, 130L, 130L, 130L), id1 = c("8113H187",
"3505H6", "3505H6", "3505H6", "3505H6", "3505H6", "3505H6", "3505H6",
"8113H187", "8113H187", "3505H6", "3505H6", "3505H6", "3505H6",
"3505H6"), id2 = c("3505H6856", "3505H6856", "3505H6856", "3505H6856",
"3505H67158", "3505H67158", "3505H63188", "3505H63188", "3505H64691",
"3505H64691", "3505H664133", "3505H664133", "3505H658134", "3505H658134",
"3505H658134")), .Names = c("id", "id1", "id2"), row.names = c(NA,
-15L), class = c("data.table", "data.frame"))
Answer 0 (score: 1)
We can use Map to compare each element of 'id1', used as the pattern, with the corresponding element of 'id2':
DT[unlist(Map(grepl, id1, id2))]
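To see what the Map call produces, here is a minimal sketch on a three-row subset of the question's columns, held in plain vectors rather than in DT (that simplification is an assumption made for illustration). grepl() takes a single pattern, so the pairing of each id1 with its own id2 has to come from Map:

```r
# Three-row subset of the question's data, as plain vectors (illustrative only)
id1 <- c("8113H187", "3505H6", "3505H6")
id2 <- c("3505H6856", "3505H6856", "3505H67158")

# Map() calls grepl(id1[i], id2[i]) once per position and returns a list
pairwise <- Map(grepl, id1, id2)

# unlist() flattens the list to a logical vector; unname() drops the
# names that Map() copies over from id1
keep <- unname(unlist(pairwise))
print(keep)
```

The resulting logical vector has one entry per row, which is why it can be used directly as the row filter in DT[...].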
Answer 1 (score: 1)
DT[mapply( grepl, id1, id2), ]
#      id    id1         id2
#  1:  52 3505H6   3505H6856
#  2:  52 3505H6   3505H6856
#  3:  52 3505H6   3505H6856
#  4:  54 3505H6  3505H67158
#  5:  54 3505H6  3505H67158
#  6:  84 3505H6  3505H63188
#  7:  84 3505H6  3505H63188
#  8: 129 3505H6 3505H664133
#  9: 129 3505H6 3505H664133
# 10: 130 3505H6 3505H658134
# 11: 130 3505H6 3505H658134
# 12: 130 3505H6 3505H658134
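
A possible variation on the mapply approach, sketched here under a couple of assumptions: the table is rebuilt with rep() as shorthand for the question's data, and fixed = TRUE is passed through MoreArgs so that id1 is matched as a literal string rather than as a regular expression. Neither detail comes from the answers above; USE.NAMES = FALSE only keeps the logical vector unnamed.

```r
library(data.table)

# Rebuild the question's table; rep() is shorthand for the repeated values
DT <- data.table(
  id  = c(52L, 52L, 52L, 52L, 54L, 54L, 84L, 84L,
          87L, 87L, 129L, 129L, 130L, 130L, 130L),
  id1 = c("8113H187", rep("3505H6", 7), rep("8113H187", 2), rep("3505H6", 5)),
  id2 = c(rep("3505H6856", 4), rep("3505H67158", 2), rep("3505H63188", 2),
          rep("3505H64691", 2), rep("3505H664133", 2), rep("3505H658134", 3))
)

# Row-wise match: does id2[i] contain id1[i] as a literal substring?
res <- DT[mapply(grepl, id1, id2,
                 MoreArgs = list(fixed = TRUE),  # assumption: literal match wanted
                 USE.NAMES = FALSE)]
print(res)
```

With this data the identifiers contain no regex metacharacters, so the literal match should select the same rows as the plain grepl call; the difference would only matter for values containing characters such as "." or "(".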