Question

Apologies in advance if this has been asked elsewhere. I wasn't sure how to search for it, so I hope I'm not reposting an existing question.

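I'm seeing some odd behavior in R when I try to filter a data.table based on whether the values in one column are present in another column. I may not be going about this the best way, so I'm open to guidance on that, but I'd also like to better understand why R behaves the way it does.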

I have a dataset:

library(data.table)
# COLA (length 4) and COLB (length 2) are recycled to the 8 rows of GRP
dt <- data.table(GRP = c(rep("a", 4), rep("b", 4)),
                 COLA = c("Type C plus more","Type C plus more", "Type D then some", "Type D then some"),
                 COLB = c("Type C","Type D"))

#    GRP             COLA   COLB
# 1:   a Type C plus more Type C
# 2:   a Type C plus more Type D
# 3:   a Type D then some Type C
# 4:   a Type D then some Type D
# 5:   b Type C plus more Type C
# 6:   b Type C plus more Type D
# 7:   b Type D then some Type C
# 8:   b Type D then some Type D

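I want to filter COLA based on the values in the existing COLB of dt. I expected this to be some form of string or regex matching, so I figured grepl would be appropriate.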

dt[grepl(COLB,COLA)]

#    GRP             COLA   COLB
# 1:   a Type C plus more Type C
# 2:   a Type C plus more Type D
# 3:   b Type C plus more Type C
# 4:   b Type C plus more Type D

I get the same output even if I use fixed = TRUE.
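One possible explanation, offered as a sketch rather than a confirmed diagnosis: the pattern argument of grepl is not vectorized, so when it is handed a whole column only the first element is used (R emits a warning saying so in the versions I am aware of). That would also explain why fixed = TRUE changes nothing. The vectors cola and colb below are hypothetical stand-ins for the first four rows of COLA and COLB, not the table itself.

```r
# Stand-ins for the first four rows of COLA and COLB (assumed names).
cola <- c("Type C plus more", "Type C plus more",
          "Type D then some", "Type D then some")
colb <- c("Type C", "Type D", "Type C", "Type D")

# Passing the whole vector as the pattern: only colb[1] ("Type C") is used.
# suppressWarnings hides the warning R raises about the extra elements.
whole <- suppressWarnings(grepl(colb, cola, fixed = TRUE))

# Explicitly matching every element of cola against the first pattern only.
first_only <- grepl(colb[1], cola, fixed = TRUE)

# Both calls give the same logical vector.
print(identical(whole, first_only))
```

If this is what is happening, every row is effectively tested against "Type C", which matches the rows that were returned.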

How am I getting FALSE wherever COLA = "Type D then some", while always getting TRUE wherever COLA = "Type C plus more"?

For the record, when I run grepl("Type D", "Type C plus more") on its own, it returns FALSE.
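Doing that single-pair check once per row, pairing each COLB value with the COLA value on the same row, can be sketched with mapply. This is only one way to do it, under the assumption that a row-by-row literal match is what is wanted; the names keep and matched are introduced here for illustration, and the table is rebuilt with explicit rep calls so the block runs on its own.

```r
library(data.table)

# Same eight rows as the dataset above, with the recycling written out.
dt <- data.table(
  GRP  = rep(c("a", "b"), each = 4),
  COLA = rep(c("Type C plus more", "Type C plus more",
               "Type D then some", "Type D then some"), times = 2),
  COLB = rep(c("Type C", "Type D"), times = 4)
)

# mapply calls grepl once per row: pattern = COLB[i], x = COLA[i].
# fixed = TRUE treats each COLB value as a literal string, not a regex.
keep <- mapply(grepl, dt$COLB, dt$COLA,
               MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Keep only the rows where COLA contains the COLB value from the same row.
matched <- dt[keep]
print(matched)
```

With this data, keep is TRUE only for the rows where "Type C" sits next to "Type C plus more" or "Type D" sits next to "Type D then some", which leaves four rows.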

Strange behavior matching strings in data.table
