Question

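I have a data frame in which one field contains various numbers. However, it also contains values such as 0/000/00000000. How can I identify all values in the dataset that consist only of zeros (0, 00, 000, 0000, 00000, and so on up to 0000000000) and display all of those records? Chaining the OR logical operator over every combination seems tedious. Is there another way?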

Answer 1

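Use a regular expression. I'm assuming the column is a character vector. Note that grep() returns the positions of the matching elements, not the rows themselves.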

grep("^0+$", df$col)
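To actually display the matching records, the positions returned by grep() can be used as a row index. The following is a minimal sketch under stated assumptions: the data frame `df`, its columns `id` and `col`, and the sample values are made up for illustration, and the column is assumed to be character. It also shows a bounded variant, `^0{1,10}$`, for the "up to ten zeros" limit mentioned in the question.

```r
# Hypothetical sample data; `df` and `col` mirror the names used in the answer.
df <- data.frame(
  id  = 1:6,
  col = c("0", "123", "000", "1000", strrep("0", 10), strrep("0", 11)),
  stringsAsFactors = FALSE
)

# grep() returns the positions of the elements made up only of zeros ...
idx <- grep("^0+$", df$col)

# ... which can then be used to pull out the full records.
all_zero <- df[idx, ]

# Variant that caps the run at ten zeros, as described in the question.
up_to_ten <- df[grepl("^0{1,10}$", df$col), ]

print(all_zero)
print(up_to_ten)
```

If the column were numeric rather than character, all of these values would already be stored as the single number 0, and `df[df$col == 0, ]` would be enough.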

Answer 2

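Create sample data:

set.seed(100)
library('data.table')
# draw 52 values with replacement
nums <- sample(c(11101, 11001, 10001, 99991, 99992, 99993), 52, replace = TRUE)
# repeat LETTERS twice so that both columns have 52 rows
DT <- data.table(A = rep(LETTERS, 2), B = nums)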

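Using data.table (this keeps every row whose value contains a 0 anywhere):

# convert B to character so it is matched as text
DT[, B := as.character(B)]
subDT <- DT[B %like% '0']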
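The pattern '0' matches a zero anywhere in the value. For the question as asked (values made up only of zeros), the same `%like%` filter can be given an anchored pattern. This is a sketch under assumptions: `zeros_dt` and its contents are hypothetical, chosen so that some all-zero values are actually present.

```r
library(data.table)

# Hypothetical data that contains some all-zero values.
zeros_dt <- data.table(
  A = letters[1:5],
  B = c("0", "10001", "000", "99991", "00000000")
)

# '^0+$' keeps only rows whose B consists of zeros from start to end.
only_zeros <- zeros_dt[B %like% "^0+$"]

print(only_zeros)
```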

Using a data.frame with data.table's like():

# setDF() converts DT to a data.frame by reference
setDF(DT)
subDT <- DT[like(DT$B, '0'),]

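Using a data.frame with dplyr:

library('dplyr')
# fixed = TRUE treats '0' as a literal string rather than a regular expression
subDT <- DT %>%
  filter(grepl('0', B, fixed = TRUE))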

Using a data.frame with stringi:

library('stringi')
subDT <- DT[stri_detect_fixed(DT$B, '0'),]
# if you're only interested in leading 0's
subDT <- DT[stri_detect_regex(DT$B, '^0+'),]
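The patterns used in this answer select different things, so it may help to compare them side by side. The sketch below uses base R's grepl() on a made-up vector `x` (an assumption for illustration) to contrast "contains a 0", "starts with 0", and "consists only of 0s"; the last one is what the question describes.

```r
# Hypothetical values to compare the three patterns.
x <- c("0", "000", "0123", "10001", "99991")

contains_zero <- grepl("0", x, fixed = TRUE)  # a 0 anywhere in the value
leading_zero  <- grepl("^0+", x)              # value starts with a 0
only_zeros    <- grepl("^0+$", x)             # value is nothing but 0s

print(data.frame(x, contains_zero, leading_zero, only_zeros))
```

Any of the three logical vectors can be used directly as a row filter, for example `DT[only_zeros, ]` on a data frame with matching rows.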