I am trying to extract dollar amounts from a string.
My text looks like this:
data[4,2]
"Pay $500, $100 damages, $400 gas, $250 fee, $50 fees, and $2.50 late fee, 8 days late"
I am trying to get it to look like this (excluding the 8):
data
Person  Fine1  Fine2  Fine3  Fine4  Fine5  Fine6
4       500    100    400    250    50     2.50
My code currently looks like this:
str_extract(data[4,2], "(?<=$)(\\d|.){2,7}(?=\\s)")
But it returns NA.
What am I doing wrong?
Answer 0 (score: 3)
Try this, where `string` holds the text to search:
> str_extract_all(string, "\\d+(\\.\\d+)?")[[1]]
[1] "500" "100" "400" "250" "50" "2.50"
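As for why the original pattern returns NA: an unescaped `$` in a regular expression is the end-of-string anchor, not a literal dollar sign, so `(?<=$)` can only be satisfied at the very end of the text, where no characters are left to match. A minimal sketch of the fix, assuming the text from the question is stored in a variable named `x` (a name chosen here for illustration), escapes the `$` inside the lookbehind:

```r
library(stringr)

# Text copied from the question; `x` is an illustrative name.
x <- "Pay $500, $100 damages, $400 gas, $250 fee, $50 fees, and $2.50 late fee, 8 days late"

# Original attempt: `$` is unescaped, so it acts as an end-of-string anchor.
# The question reports that this call returns NA.
original <- str_extract(x, "(?<=$)(\\d|.){2,7}(?=\\s)")

# Escaped version: the lookbehind now requires a literal "$" before the digits,
# and str_extract_all() returns every match instead of only the first.
fixed <- str_extract_all(x, "(?<=\\$)\\d+(\\.\\d+)?")[[1]]
fixed
# [1] "500"  "100"  "400"  "250"  "50"   "2.50"
```

Because the lookbehind demands a preceding `$`, the "8" in "8 days late" is skipped.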
Using base R:
> strsplit(trimws(gsub("[^[:digit:]. ]", "", string)), "\\s+")[[1]]
[1] "500" "100" "400" "250" "50" "2.50"
If your string also contains numbers that are not amounts, as in this example:
string <- "Pay $500 and $2.50 late fee. Pay $200 for 3 cats and buy 3 apples"
and you only want to extract the prices, you can use this code:
> library(stringr)
> library(magrittr)
> string %>%
+   str_extract_all("\\$\\s*\\d+(\\.\\d+)?") %>%
+   unlist() %>%
+   gsub("\\$", "", .)
[1] "500" "2.50" "200"
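To get from the extracted strings to the one-row layout shown in the question, the matches can be converted to numbers and given `Fine1`, `Fine2`, ... column names. The following is only a sketch in base R under some assumptions: the variable names `x`, `amounts`, `fines`, and `fine_row` are made up for illustration, and the `Person` value 4 is taken from the row index in the question.

```r
# Text copied from the question; `x` is an illustrative name.
x <- "Pay $500, $100 damages, $400 gas, $250 fee, $50 fees, and $2.50 late fee, 8 days late"

# Keep only numbers directly preceded by a literal "$" (PCRE lookbehind).
amounts <- regmatches(x, gregexpr("(?<=\\$)\\d+(?:\\.\\d+)?", x, perl = TRUE))[[1]]

# Convert to numeric; note that "2.50" becomes 2.5 once it is a number.
fines <- as.numeric(amounts)

# Build a one-row data frame with columns Fine1..FineN, then prepend Person.
fine_row <- as.data.frame(t(setNames(fines, paste0("Fine", seq_along(fines)))))
fine_row <- cbind(Person = 4, fine_row)
fine_row
```

If the trailing zero in `2.50` matters for display, keeping the values as character strings (skipping `as.numeric()`) or formatting them on output would be the alternatives.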