This is my pattern:
pattern_1a = re.compile(r"(?:```|\n)Item *1A\.?.{0,50}Risk Factors.*?(?:\n)Item *1B(?!u)", flags = re.I|re.S)
Why doesn't it match the following text? What is wrong?
"""
Item 1A.
Risk
Factors
If we
are unable to commercialize
ADVEXIN
therapy in various markets for multiple indications,
particularly for the treatment of recurrent head and neck
cancer, our business will be harmed.
under which we may perform research and development services for
them in the future.
42
Table of Contents
We believe the foregoing transactions with insiders were and are
in our best interests and the best interests of our
stockholders. However, the transactions may cause conflicts of
interest with respect to those insiders.
Item 1B.
"""
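Answer 0 (score: 1)
Here is one solution that matches the actual text. Putting parentheses ( ) around your literal strings will solve a lot of problems. See the solution below.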
pattern_1a = re.compile(r"(?:```|\n)(Item 1A)[.\n]{0,50}(Risk Factors)([\n]|.)*(\nItem 1B.)(?!u)", flags = re.I|re.S)
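Proof of the match: https://regexr.com/41ejq
Answer 1 (score: 0)
The problem is that "Risk Factors" is spread across two lines. The text actually contains: Risk\nFactors
Use general whitespace \s or a newline \n instead of a literal space to match the text.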
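
As a minimal sketch of that suggestion, the snippet below compares the pattern from the question with a variant in which the literal spaces are replaced by \s* and \s+. The sample string is a shortened stand-in for the filing text, assumed to have "Risk" and "Factors" on separate lines as in the question. The names sample, original and fixed are only illustrative.

```python
import re

# Shortened stand-in for the text in the question (assumption: "Risk" and
# "Factors" sit on separate lines, exactly as shown there).
sample = (
    "\nItem 1A.\nRisk\nFactors\nIf we\nare unable to commercialize\n"
    "ADVEXIN\ntherapy in various markets.\n"
    "Item 1B.\n"
)

FLAGS = re.I | re.S

# Pattern from the question: the literal space in "Risk Factors"
# cannot match the newline between the two words.
original = re.compile(
    r"(?:```|\n)Item *1A\.?.{0,50}Risk Factors.*?(?:\n)Item *1B(?!u)", FLAGS
)

# Same pattern, but whitespace is matched with \s so a space, a newline,
# or a run of both is accepted.
fixed = re.compile(
    r"(?:```|\n)Item\s*1A\.?.{0,50}Risk\s+Factors.*?\nItem\s*1B(?!u)", FLAGS
)

original_match = original.search(sample)
fixed_match = fixed.search(sample)

print("original matched:", original_match is not None)
print("fixed matched:", fixed_match is not None)
```

Using \s+ rather than \n alone keeps the pattern working for filings where the heading is on one line as well as for those where it is split.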