Question

I am trying to use stringr to extract matching fragments from a paragraph of text. One of the texts is:

if returnValue is not null then 
1. if  instrument type is "Bond" then 
      Status is equals to 138 if the instrument is sensible coupon, 
      coupon type is not null and not equals to "ZERO COUPON" and previous value 
      is not equals to current value, and iinstrument creation date is not D 
- Status is equals to 137 if the instrument is sensible bbg, previous value 
      is not equals to current value, and iinstrument creation date is not D or D-1
- Status is equals  to the previous status if  the value is not manual 
        and previous status is 138, or 137

2. if attribute SEC_PAYT_DTE is not null then 
    if attribute SEC_PAYT_DTE (typed as date) is fresher than 
        returnValue (typed as date) then 
    set status to 136 that is "Functional Error"
3. if acrual date (DEBT_STRT_ACRL_DTE) is not null  and instrument 
        category is "Structured Product", and acrual date is different 
        frorm return value then 
  set status to 150 that is "Non blocking functional error".

What I want to extract is "Status 138", "Status 137", "Status 136", "Status 150".

What I tried is str_extract_all(x, '(S|s)tatus[a-z\s]{1,10}[0-9]{1,3}[^\.]'), but it does not work.

Answer 1

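One way to handle this is to split the text into lines yourself and run str_extract_all on each line. (Note that stringr matches with the ICU regex engine through stringi, not POSIX, and a pattern is not stopped by newlines, so the split is a convenience here, not a requirement.)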

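library(stringr)

# `val` holds the rule text from the question as one string; match line by line.
matches <- sapply(strsplit(val, "\n")[[1]],
  str_extract_all, "[Ss]tatus is(?: equals to)? [0-9]+")
# Drop lines without a match, then strip the filler words around the number.
matches <- gsub(fixed = TRUE, "is ", "", gsub(fixed = TRUE, " equals to", "",
  Filter(length, matches)))
# [1] "Status 138" "Status 137" "status 138"
# Note: "set status to 136" and "set status to 150" are not covered by this pattern.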
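
As an alternative, here is a minimal sketch that runs on the whole multi-line string in one call and also picks up the "set status to ..." lines. It is not from the original answer. The shortened sample text, the pattern, and the names `codes` and `result` are assumptions for illustration, and the pattern only covers the three phrasings seen in the question ("status is N", "status is equals to N", "status to N").

```r
library(stringr)

# Shortened sample of the rule text from the question, as one multi-line string.
val <- paste(
  'Status is equals to 138 if the instrument is sensible coupon,',
  '- Status is equals to 137 if the instrument is sensible bbg, previous value',
  '        and previous status is 138, or 137',
  '    set status to 136 that is "Functional Error"',
  '  set status to 150 that is "Non blocking functional error".',
  sep = "\n"
)

# Case-insensitive "status", followed by " is", " is equals to" or " to",
# then a space and the number, which is captured in group 1.
pattern <- regex("status(?: is(?: equals to)?| to) (\\d+)", ignore_case = TRUE)

# str_match_all() returns one matrix per input string; column 2 holds group 1.
codes <- str_match_all(val, pattern)[[1]][, 2]

# Drop the repeated 138 and rebuild the labels in the requested form.
result <- paste("Status", unique(codes))
print(result)
# [1] "Status 138" "Status 137" "Status 136" "Status 150"
```

Capturing only the number avoids the two gsub calls, and unique() removes the second 138 that comes from "previous status is 138".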
