How do I extract numbers with a decimal point or a dash from text?

Time: 2018-06-21 17:37:23

Tags: r regex string

Hi, I have a quick question.

I have a document whose text looks like this:

Hi, my name is John Doe and I would like a new xds 6543.21-M for blah blah 
blah. I would also like hre 350-M for blah blah blah.

I would like to return:

6543.21-M
350-M

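2 answers:

Answer 0: (score: 0)

We can split each string on spaces and then use grep to keep only the elements that contain a digit. lapply applies the grep call to every element of the resulting list.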

vec <- c("Hi, my name is John Doe and I would like a new xds 6543.21-M for blah blah", 
         "blah. I would also like hre 350-M for blah blah blah.")

list1 <- strsplit(vec, split = " ")
list2 <- lapply(list1, function(x) grep("[0-9]+", x, value = TRUE))
list2
# [[1]]
# [1] "6543.21-M"
# 
# [[2]]
# [1] "350-M"
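
As a variation on the split-and-grep idea above, here is a minimal base R sketch that extracts only the matched substring instead of the whole space-separated token, so trailing punctuation such as a comma would not be carried along. The pattern is an assumption about what the codes look like (digits, an optional decimal part, an optional dash followed by letters), and the names pattern and codes are made up for this illustration.

```r
vec <- c("Hi, my name is John Doe and I would like a new xds 6543.21-M for blah blah",
         "blah. I would also like hre 350-M for blah blah blah.")

# Assumed shape of a code: digits, optional ".digits", optional "-LETTERS"
pattern <- "[0-9]+(\\.[0-9]+)?(-[A-Za-z]+)?"

# gregexpr finds every match position; regmatches pulls out the matched text,
# returning one character vector per input string
codes <- regmatches(vec, gregexpr(pattern, vec))

# Flatten the list if a single character vector is more convenient
unlist(codes)
```

If the suffix is always -M, the optional last group could be replaced with a literal -M to make the match stricter.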

Answer 1: (score: 0)

You can use the str_extract_all function from the stringr package:

x <- "Hi, my name is John Doe and I would like a new xds 6543.21-M for blah blah
blah. I would also like hre 350-M for blah blah blah."
#install.packages("stringr")
library(stringr)
str_extract_all(x, '[0-9]+(?:\\.[0-9]+)?-M')
# [[1]]
# [1] "6543.21-M" "350-M"