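Can anyone help? (Also posted on the RegexBuddy forum.)
I have this relatively large (auto-generated) regular expression (listed in full at the bottom), and it uses the following fragment many times: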
# Add words to word list
(?<_KC1>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
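This is there to scoop up the words and text between the 'better-known' fragments. These captures are all aggregated in code to provide a list of the words in the overall match.
The problem I am having is with the first alternative, namely: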
# Pair of Strike prices
(?<Strike>[+|-]?\d+(?:\.\d+)?)/(?<Strike2>[+|-]?\d+(?:\.\d+)?)
# Add to Word List (but not 'x' as last word) !!!!!!!!!!!! This is what needs changing
(?<_KC3>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Cross price
(?:x[ \t]?-?(?<Cross>[+|-]?\d+(?:\.\d+)?)x?)?
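As you can see, the "Cross price" always begins with an 'x', so what I need is a pattern that is as similar as possible to the first fragment I mentioned, but that ignores the last word if it happens to be 'x'.
There are two further complications:
1) The "Cross price" is itself optional.
2) An 'x' on its own can match the "Futures expiry date" as a Reuters date code.
I have tried negative lookarounds and so on, but whatever I do I mess something else up. I believe the answer may lie in an If-Then-Else conditional, but I'm not sure.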
For example:
WTI AMERICAN:Jun12 110.00/140.00 [1x2] Call Spread x 102.50 350 - 365
The "Pair of Strike prices" is returning "110.00/140.00", as expected.
But the Word List is picking up "[1x2] Call Spread x", and "102.50", which should be the "Cross price", is now being matched later in the expression as the "Bid" part of the "Bid/Offer spread".
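The behaviour described above can be reproduced with a cut-down version of the first alternative. What follows is only a minimal sketch, not the full expression: it is written for Python's `re` module (so named groups use `(?P<name>...)`), it keeps only the strike pair, word list, cross price and bid/offer parts, and the names `WORDS`, `NUM`, `ORIGINAL`, `FIXED` and `SAMPLE` are invented for illustration. The candidate change assumes it is acceptable to make the word list and the cross price one optional unit, so that the lazy word list is forced to stop at a word-initial 'x' followed by a number; it does not address the second complication (an 'x' matching the futures expiry date).

```python
import re

# Fragments copied from the question's pattern.
WORDS = r"(?:\w|[ \t\\/]|\[\w*\])*?"   # lazy "word list" fragment
NUM = r"[+|-]?\d+(?:\.\d+)?"           # price fragment, as written in the question

# Original shape: the word list and the optional Cross price are independent,
# so both can match empty and the following word list swallows the 'x'.
ORIGINAL = re.compile(rf"""
    (?P<Strike>{NUM})/(?P<Strike2>{NUM})
    (?P<_KC3>{WORDS})
    (?:x[ \t]?-?(?P<Cross>{NUM})x?)?
    (?P<_KC7>{WORDS})
    [ \t,]+
    (?P<Bid>{NUM})[ \t]*(?:/|-|\ )[ \t]*(?P<Offer>{NUM})
""", re.VERBOSE)

# Candidate shape (an assumption, not a confirmed fix): word list + Cross price
# form a single optional group, and \b requires the 'x' to start a word.
FIXED = re.compile(rf"""
    (?P<Strike>{NUM})/(?P<Strike2>{NUM})
    (?:
        (?P<_KC3>{WORDS})
        \bx[ \t]?-?(?P<Cross>{NUM})x?
    )?
    (?P<_KC7>{WORDS})
    [ \t,]+
    (?P<Bid>{NUM})[ \t]*(?:/|-|\ )[ \t]*(?P<Offer>{NUM})
""", re.VERBOSE)

# Tail of the example line, starting at the strike pair.
SAMPLE = "110.00/140.00 [1x2] Call Spread x 102.50 350 - 365"

for label, pattern in (("original", ORIGINAL), ("candidate", FIXED)):
    m = pattern.search(SAMPLE)
    print(label, m.group("Cross"), m.group("Bid"), m.group("Offer"))
```

With the candidate shape, a line that has no cross price lets the optional group fail as a whole, and the words are then picked up by the following word list instead. One caveat: the lazy word list inside the optional group will scan as far as it can looking for an 'x' plus number, so it may need tightening if such text can legitimately appear later in the line.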
Any help appreciated.
Cheers, Simon
# Match this group (optional)
(?:
# Match one of the product symbols or their aliases
\b(?<ProductSymbol>CL|Brent|GasOil|WTI|LO|BRT)\b
# Add words to word list
(?<_KC1>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Skip over whitespace plus any of these characters [:]
[ \t:]+
)?
# Futures expiry date
(?<=[ \t]|'|^)(?<FuturesExpiryPeriod>(?<_MY>(?<_MYP>(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?))[ \t]?(?<_MYY>(?:20)?\d\d))|(?<_CE>Cal-?(?<_CEY>(?:20)?\d\d))|(?<_QF>Q(?:uarter)?(?<_QFP>1|2|3|4)[ \t]*(?<_QFY>(?:20)?\d\d))|(?<_QL>(?<_QLP>1|2|3|4)[ \t]*Q(?:uarter)?[ \t]*(?<_QLY>(?:20)?\d\d))|(?<_HY>(?<_HYP>1|2)[ \t]*H(?:alf)?[ \t]*(?<_HYY>(?:20)?\d\d))|(?<_ER>(?<_ERP>[FGHJKMNQUVXZ])(?<_ERY>\d{0,2}))[ \t]*)
# Skip over whitespace
[ \t]+
# Add words to word list
(?<_KC2>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Match one of the following choices (in order):
(?:
(?: # First choice
# Pair of Strike prices
(?<Strike>[+|-]?\d+(?:\.\d+)?)/(?<Strike2>[+|-]?\d+(?:\.\d+)?)
# Add to Word List (but not 'x' as last word) !!!!!!!!!!!! This is what needs changing
(?<_KC3>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Cross price
(?:x[ \t]?-?(?<Cross>[+|-]?\d+(?:\.\d+)?)x?)?
)
|
(?: # Second choice
# Cross price
(?:x[ \t]?-?(?<Cross>[+|-]?\d+(?:\.\d+)?)x?)
# Add words to word list
(?<_KC4>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Pair of Strike prices
(?<Strike>[+|-]?\d+(?:\.\d+)?)/(?<Strike2>[+|-]?\d+(?:\.\d+)?)?
)
|
(?: # Third choice
# Single Strike price
(?<Strike>[+|-]?\d+(?:\.\d+)?)
# Add to Word List (but not 'x' as last word) !!!!!!!!!!!! This is what needs changing
(?<_KC5>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Cross price
(?:x[ \t]?-?(?<Cross>[+|-]?\d+(?:\.\d+)?)x?)?
)
|
(?: # Fourth choice
# Cross price
(?:x[ \t]?-?(?<Cross>[+|-]?\d+(?:\.\d+)?)x?)
# Add words to word list
(?<_KC6>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Single Strike price
(?<Strike>[+|-]?\d+(?:\.\d+)?)?
)
)
# Add words to word list
(?<_KC7>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# Skip over whitespace plus any of these characters [,]
[ \t,]+
# Bid/Offer spread
(?<Bid>[+|-]?\d+(?:\.\d+)?)[ \t]*(?:/|-|\ )[ \t]*(?<Offer>[+|-]?\d+(?:\.\d+)?)
# Look for any other keywords in brackets (optional)
(?:
# Skip over whitespace
[ \t]*
# <pattern>
\(
# Add words to word list
(?<_KC8>(?:(?:\w|[ \t\\/]|\[\w*\])*?))
# <pattern>
\)
)?
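Answer 0 (score: 0)
If you are reading the content from a file or something similar, you would be better off using a tool such as awk to parse it. Don't go for complicated regex routines, as they can cause problems in scenarios you did not anticipate.
Cheers!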
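To illustrate the token-by-token style of parsing suggested here, the following is a rough sketch in Python rather than awk (a substitution made purely for illustration). It assumes fields are separated by whitespace, that a strike pair is written without spaces around the slash, and that a cross price is an 'x' token followed by a number; the function name `parse_quote` and the result keys are hypothetical, and the product symbol and expiry date are not handled.

```python
import re

NUM = r"[+-]?\d+(?:\.\d+)?"
STRIKES = re.compile(rf"({NUM})/({NUM})$")   # e.g. 110.00/140.00
NUMBER = re.compile(rf"{NUM}$")              # a single price

def parse_quote(text):
    """Classify whitespace-separated tokens into strikes, cross, prices and words."""
    result = {"strikes": None, "cross": None, "numbers": [], "words": []}
    tokens = text.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        pair = STRIKES.match(tok)
        if pair:
            result["strikes"] = pair.groups()
        elif tok == "x" and i + 1 < len(tokens) and NUMBER.match(tokens[i + 1]):
            # An 'x' followed by a number is treated as the cross price.
            result["cross"] = tokens[i + 1]
            i += 1  # skip the number just consumed
        elif NUMBER.match(tok):
            # Remaining bare numbers, e.g. the bid and the offer.
            result["numbers"].append(tok)
        elif tok != "-":
            # Everything else (except the bid/offer separator) is a word.
            result["words"].append(tok)
        i += 1
    return result

print(parse_quote("110.00/140.00 [1x2] Call Spread x 102.50 350 - 365"))
```

Because each token is classified on its own, the 'x' cannot be absorbed into the word list by accident, which is the kind of interaction that is hard to control in one large expression.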