Question

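I have a set of sentences containing uppercase keywords, embedded in a larger text that also contains several other sentences. I only need to match the sentences that contain uppercase words (one or more), for example: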

This is MY SENTENCE that should be matched.
And THIS one should be too.
This other sentence should not be matched.

Any suggestions? Thanks! I'm not an advanced user...

Answer 1

Try a tool such as https://regexr.com/. Tools like this really help you visualize what your regular expression does.

For your test data, this regex works well:

([^\.]*[A-Z]{2,}[^\.]*)\.

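It consists of:

[^\.]* any run of characters that are not a dot
[A-Z]{2,} at least 2 uppercase characters
[^\.]* any run of characters that are not a dot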
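
As a rough illustration of how this pattern behaves, here is a minimal Python sketch run against the sample sentences from the question. The variable names are only illustrative. Note that [^\.] also matches newlines, so the sketch strips whitespace from each captured sentence; that stripping step is an assumption about the desired output, not part of the regex itself.

```python
import re

# Pattern from the answer above; group 1 holds the sentence without its final period.
pattern = re.compile(r'([^\.]*[A-Z]{2,}[^\.]*)\.')

text = (
    "This is MY SENTENCE that should be matched.\n"
    "And THIS one should be too.\n"
    "This other sentence should not be matched."
)

# findall returns the contents of group 1 for every match.
# [^\.] also consumes newlines, so trim whitespace around each capture.
matched = [m.strip() for m in pattern.findall(text)]

for sentence in matched:
    print(sentence)
```

Because the pattern requires at least two consecutive uppercase letters, a capitalized first word such as "This" does not trigger a match on its own.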

Answer 2

Here you go:

^.*\b[A-Z]+\b.*$

\b asserts position at a word boundary
A-Z matches a single character in the range between A (index 65) and Z (index 90)

https://regex101.com/r/kUN41W/1

If "I" should not count as an uppercase word that qualifies a sentence, use:

^.*\b[A-Z]{2,}\b.*$

{2,} quantifier: matches between 2 and unlimited times, as many times as possible, giving back as needed (greedy)
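
Both patterns are anchored with ^ and $, so they match whole lines rather than period-delimited sentences. A minimal Python sketch of that behavior, assuming one sentence per line and the re.MULTILINE flag so the anchors apply to each line; the fourth test line is a made-up addition to show the difference between the two patterns:

```python
import re

text = (
    "This is MY SENTENCE that should be matched.\n"
    "And THIS one should be too.\n"
    "This other sentence should not be matched.\n"
    "I think this one only has a one-letter uppercase word."  # hypothetical extra line
)

# With re.MULTILINE, ^ and $ match at the start and end of every line.
any_upper = re.compile(r'^.*\b[A-Z]+\b.*$', re.MULTILINE)
two_plus = re.compile(r'^.*\b[A-Z]{2,}\b.*$', re.MULTILINE)

lines_any = any_upper.findall(text)        # counts "I" as an uppercase word
lines_two_plus = two_plus.findall(text)    # needs 2+ uppercase letters in a row

print(lines_any)
print(lines_two_plus)
```

The first pattern keeps the line that starts with "I", while the {2,} variant drops it.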

Answer 3

Using Python

import re

txt = 'This is MY SENTENCE and I would like, this sentence, to be matched because it contains uppercase words. This other sentence should not be matched. And THIS one should be.' 

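# Split the text on periods and print every piece that contains
# an all-uppercase word (this also counts a single letter such as "I").
for s in txt.split('.'):
    if re.search(r'\b[A-Z]+\b', s):
        print(s)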

Output:

This is MY SENTENCE and I would like, this sentence, to be matched because it contains uppercase words
 And THIS one should be
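
The output above loses the trailing period and keeps the leading space left over from the split. A possible variation, sketched here with a hypothetical helper name and a hypothetical min_len parameter (neither is from the original answer), trims each sentence, restores the period, and lets you require longer uppercase words so that a lone "I" is ignored:

```python
import re

def sentences_with_uppercase_words(text, min_len=1):
    """Return period-delimited sentences containing an all-uppercase word
    of at least min_len letters (hypothetical helper, for illustration)."""
    pattern = re.compile(r'\b[A-Z]{%d,}\b' % min_len)
    results = []
    for s in text.split('.'):
        s = s.strip()                 # drop the whitespace left by the split
        if s and pattern.search(s):
            results.append(s + '.')   # put back the period removed by split
    return results

txt = ('This is MY SENTENCE and I would like, this sentence, to be matched '
       'because it contains uppercase words. This other sentence should not '
       'be matched. And THIS one should be.')

for sentence in sentences_with_uppercase_words(txt, min_len=2):
    print(sentence)
```

Splitting on every period is a simplification: abbreviations or decimal numbers in the text would be cut in the wrong place.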

Regex: how to match anything before and after an uppercase sequence, with the period as the delimiter?
