The 'size' column contains text like:
row_1 = "Small size From 3 mm long when unfed to 9 mm when fully engorged"
row_2 = "Tiny some microscopic Red mite only 0 4 mm diameter Worldwide many different"
row_3 = "Small spiders body length about 10 mm"
size = c(row_1, row_2, row_3)
How can I extract the sizes into a new column, "new_size", as shown below:
size_1 = '3mm, 9mm'
size_2 = '4mm'
size_3 = '10mm'
new_size = c(size_1, size_2, size_3)
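I have seen the substring approach, but I can't figure out how to pull the sizes out when the surrounding text differs in every row.
Answer 0 (score: 1)
Try this: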
# Return every number (integer or decimal) found in a string, as character.
Numb_Extract <- function(string) {
  unlist(regmatches(string, gregexpr("[[:digit:]]+\\.*[[:digit:]]*", string)))
}
row_1 <- "Small size From 3 mm long when unfed to 9 mm when fully engorged"
p <- as.numeric(Numb_Extract(row_1))
print(p)
#[1] 3 9
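One caveat with extracting bare numbers: row_2 in the question reads "0 4 mm", so this approach returns 0 and 4 as two separate values for that row, and it would also pick up any number that is not a size. A small sketch applying the same function row by row (the `size` vector is copied from the question; `nums` is just an illustrative name):

```r
# Number extractor, same pattern as in the answer above.
Numb_Extract <- function(string) {
  unlist(regmatches(string, gregexpr("[[:digit:]]+\\.*[[:digit:]]*", string)))
}

size <- c(
  "Small size From 3 mm long when unfed to 9 mm when fully engorged",
  "Tiny some microscopic Red mite only 0 4 mm diameter Worldwide many different",
  "Small spiders body length about 10 mm"
)

# lapply keeps one result per row instead of flattening all rows together.
nums <- lapply(size, function(s) as.numeric(Numb_Extract(s)))
# nums[[1]] is c(3, 9), nums[[2]] is c(0, 4), nums[[3]] is 10
```

If only the value directly in front of "mm" matters, anchoring the pattern on the unit avoids the stray 0.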
Answer 1 (score: 0)
Use regmatches/gregexpr.
regmatches(size, gregexpr("[[:digit:]]+[[:space:]]mm", size))
#[[1]]
#[1] "3 mm" "9 mm"
#
#[[2]]
#[1] "4 mm"
#
#[[3]]
#[1] "10 mm"
If you need a vector, unlist the result.
size_n <- regmatches(size, gregexpr("[[:digit:]]+[[:space:]]mm", size))
unlist(size_n)
#[1] "3 mm" "9 mm" "4 mm" "10 mm"
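Note that unlist flattens everything into four elements, so the link to the original three rows is lost. To get the layout asked for in the question (one string per row, such as '3mm, 9mm'), the matches can be collapsed row by row instead. A minimal sketch, assuming the sizes are always written as digits, one whitespace character, then "mm"; `matches` is just an illustrative name:

```r
size <- c(
  "Small size From 3 mm long when unfed to 9 mm when fully engorged",
  "Tiny some microscopic Red mite only 0 4 mm diameter Worldwide many different",
  "Small spiders body length about 10 mm"
)

# One character vector of matches per row.
matches <- regmatches(size, gregexpr("[[:digit:]]+[[:space:]]mm", size))

# Drop the space inside each match, then join a row's matches with ", ".
# vapply guarantees exactly one string per row (an empty string if no match).
new_size <- vapply(
  matches,
  function(m) paste(sub("[[:space:]]", "", m), collapse = ", "),
  character(1)
)
# new_size is c("3mm, 9mm", "4mm", "10mm")
```

Because the result has the same length as `size`, it can be assigned straight back as a new column of the data frame.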