How to subset all integer and/or numeric values in a character vector?

Time: 2019-06-07 08:36:37

Tags: r regex indexing subset

Given a character vector, for example:

set.seed(1)
chr_vec <- c(sample(1:100000, 10), "12to145", "15:19", sample(1:100000, 10), "111.333", "567.1")

How can I subset all the strings that represent integers? For example:

int_vec <- chr_vec[c(1:10, 13:22)]

How can I subset all the strings that represent numbers (integers or decimals)? For example:

num_vec <- chr_vec[c(1:10, 13:24)]

2 answers:

Answer 0: (score: 2)

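You can use gsub to remove the digits and then compare what is left with the empty string (integers) or with either the empty string or a single dot (numerics), i.e.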

ints <- chr_vec[gsub('\\d+', '', chr_vec) == '']
numerics <- chr_vec[gsub('\\d+', '', chr_vec) %in% c('', '.')]

Test

identical(numerics, num_vec)
#[1] TRUE
identical(ints, int_vec)
#[1] TRUE
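
To see what the gsub comparison accepts beyond the question's data, here is a minimal sketch on a small vector. The vector x and the name leftover are hypothetical and chosen only to hit edge cases; they are not part of the original answer.

```r
# Hypothetical test vector (not from the question), chosen to hit edge cases.
x <- c("42", "3.14", "12to145", "15:19", ".5", "5.", "", "1.2.3")

# What remains of each string once every run of digits is removed.
leftover <- gsub("\\d+", "", x)

# Integers: nothing is left. Numerics: nothing or exactly one dot is left.
ints <- x[leftover == ""]
numerics <- x[leftover %in% c("", ".")]

print(ints)
print(numerics)
```

On this vector the approach also keeps the empty string as an "integer" and keeps ".5" and "5." as "numerics", so it fits best when the input is known not to contain such strings.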

Answer 1: (score: 0)

We can use grep with an anchored pattern.

For integers

grep("^\\d+$", chr_vec, value = TRUE)

#[1] "26551" "37213" "57285" "90819" "20168" "89835" "94462" 
#    "66076" "62907" "6179"  "20598" "17656" "68701" "38410" 
#    "76982" "49768" "71758" "99184" "38001" "77738"

and for numerics

grep("^\\d+(\\.\\d+)?$", chr_vec, value = TRUE)

#[1] "26551" "37213" "57285" "90819" "20168" "89835" "94462"   
#    "66076" "62907" "6179" "20598" "17656" "68701"  "38410"   
#    "76982" "49768" "71758" "99184" "38001" "77738" "111.333" "567.1"
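
The same anchored patterns can also be used with grepl to build a logical index, which is convenient when the mask is needed more than once. The following is a rough sketch on a hypothetical vector x (not the question's data); the variable names are illustrative only.

```r
# Hypothetical test vector (not from the question), chosen to hit edge cases.
x <- c("42", "3.14", "12to145", "15:19", ".5", "5.", "", "1.2.3")

# Logical masks: TRUE where the whole string matches the pattern.
is_int <- grepl("^\\d+$", x)             # one or more digits, nothing else
is_num <- grepl("^\\d+(\\.\\d+)?$", x)   # digits, optionally ".digits"

int_strings <- x[is_int]
num_strings <- x[is_num]

# Every kept string is a plain decimal, so the conversion is safe.
num_values <- as.numeric(num_strings)

print(int_strings)
print(num_strings)
print(num_values)
```

Because the patterns are anchored with ^ and $, strings such as "", ".5", "5." and "1.2.3" are rejected. Neither pattern accepts signs or exponents (for example "-3" or "1e5"); if those can occur, the patterns would need to be extended.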