I am trying to use stringr to extract matching pieces from a passage of text. One of the texts is:
if returnValue is not null then
1. if instrument type is "Bond" then
Status is equals to 138 if the instrument is sensible coupon,
coupon type is not null and not equals to "ZERO COUPON" and previous value
is not equals to current value, and iinstrument creation date is not D
- Status is equals to 137 if the instrument is sensible bbg, previous value
is not equals to current value, and iinstrument creation date is not D or D-1
- Status is equals to the previous status if the value is not manual
and previous status is 138, or 137
2. if attribute SEC_PAYT_DTE is not null then
if attribute SEC_PAYT_DTE (typed as date) is fresher than
returnValue (typed as date) then
set status to 136 that is "Functional Error"
3. if acrual date (DEBT_STRT_ACRL_DTE) is not null and instrument
category is "Structured Product", and acrual date is different
frorm return value then
set status to 150 that is "Non blocking functional error".
What I want to extract is "Status 138", "Status 137", "Status 136", "Status 150".
What I tried was str_extract_all(x, '(S|s)tatus[a-z\s]{1,10}[0-9]{1,3}[^\.]'), but it does not work.
Answer 0 (score: 0)
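str_extract_all uses ICU regular expressions (via stringi), and a pattern written with literal spaces will not match where the text wraps onto a new line. One way around this is to split the text into lines and match each line yourself.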
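library(stringr)

# val holds the text from the question; split it into lines and match each one.
lines <- strsplit(val, "\n")[[1]]
matches <- unlist(str_extract_all(lines, "[Ss]tatus is(?: equals to)? [0-9]+"))
# Drop the filler words so that only "Status <number>" remains.
matches <- gsub("is ", "", gsub(" equals to", "", matches, fixed = TRUE), fixed = TRUE)
matches
# [1] "Status 138" "Status 137" "status 138"
# Note: the "set status to ..." lines are not matched by this pattern.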
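The question also asks for 136 and 150, which appear in the text as "set status to ..." and are therefore missed by the pattern above. Here is a minimal sketch of one possible alternative, not the answerer's method. It assumes that every status of interest is introduced by either "is equals to" or "to", and it uses a shortened sample of the text rather than the full passage. Writing the gaps as \\s+ lets a match span a line break, so no per-line split is needed, and requiring "equals to" or "to" skips the "previous status is 138" mention.

```r
library(stringr)

# Shortened sample of the text from the question (an assumption for illustration).
x <- paste(
  "Status is equals to 138 if the instrument is sensible coupon,",
  "- Status is equals to 137 if the instrument is sensible bbg, previous value",
  "and previous status is 138, or 137",
  "set status to 136 that is \"Functional Error\"",
  "set status to 150 that is \"Non blocking functional error\".",
  sep = "\n"
)

# \\s+ matches spaces and line breaks alike; the number is the only capture group.
pattern <- regex("status\\s+(?:is\\s+equals\\s+to|to)\\s+([0-9]+)",
                 ignore_case = TRUE)

# str_match_all returns one matrix per input string; column 2 is the captured number.
hits <- str_match_all(x, pattern)[[1]]
result <- paste("Status", hits[, 2])
print(result)
# [1] "Status 138" "Status 137" "Status 136" "Status 150"
```

If the real text introduces statuses with other wording, the alternation in the pattern would need to be extended accordingly.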