How to stop printing duplicates in Python

Time: 2015-11-16 05:04:47

Tags: regex python-2.7

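I have been working on code that uses keywords from a list. When a keyword is found in a sentence of a paragraph, the code prints that sentence together with the keyword. What I can't figure out is how to print each sentence without repeating it. Here is the code: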

import re
from random import randint

def foo():
    List1 = ['Risk','ocp','cancer','menarche','estrogen','nulliparity',]
    txt = " Risk factors for breast cancer have been well characterized. Factors associated with an increased exposure to estrogen have also been elucidated including early menarche, late menopause, later age at first pregnancy, or nulliparity."
    words = txt
    matches = []
    sentences = re.split(r'\.', txt)

    k = iter(List1)
    while True:
        try:
            keyword1 = next(k)
        except StopIteration:
            break    
        pattern = keyword1 
        re.compile(pattern)

        for sentence in sentences:
            if re.search(pattern, sentence):
                matches.append(sentence)           

                for match in matches:
                    print("Sentence matching the word (" + keyword1 + "):")##just to checkfor keyword matching
                    print (match)
        break

foo()

Despite using 'break', I still get the output below. Can this be done better?

>> Sentence matching the word ("Risk"): " Risk factors for breast cancer have been well characterized. 
>> Sentence matching the word ("Risk"): " Risk factors for breast cancer have been well characterized.
>> Sentence matching the word ("Risk"): " Risk factors for breast cancer have been well characterized.

Sometimes I get the right keyword but the wrong sentence:

>> Sentence matching the word ("ocp"): " Risk factors for breast cancer have been well characterized.

1 Answer:

Answer 0: (score: 2)

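The logic is more convoluted than it needs to be. It's hard to describe exactly what is going wrong (the reused variables don't hold what you think they do). Also, the whole while loop can be a for loop, and re.compile() does nothing unless you use its result.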
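To illustrate the two points above, here is a minimal sketch of a plain for loop that actually keeps and uses the object returned by re.compile(). The names `keywords`, `text`, and `found` are made up for this example, and the call to re.escape() is my own addition (so that a keyword is matched literally); neither is in the question's code.

```python
import re

# Hypothetical sample data, shortened from the question
keywords = ['Risk', 'ocp', 'cancer']
text = "Risk factors for breast cancer have been well characterized."

found = []
for keyword in keywords:  # a for loop replaces iter()/next()/StopIteration
    # re.compile() returns a pattern object; it only helps if you keep it
    compiled = re.compile(re.escape(keyword))
    if compiled.search(text):
        found.append(keyword)

print(found)
```

Calling re.compile(pattern) on its own line, as in the question, builds the pattern object and immediately throws it away.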

A possible rewrite:

# txt and List1 are the variables defined in the question
sentences = re.split(r'\.', txt)

for pattern in List1:
    for sentence in sentences:
        if pattern in sentence:
            print("Sentence matching the word (" + pattern + "):")
            print(sentence)

            # uncomment break if you want only the first matching sentence
            # break
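For reference, the duplicates in the question's code appear to come from two things: `matches` is never reset between keywords, and the loop that prints `matches` sits inside the `if`, so every new hit reprints all earlier hits under the current keyword. That would also explain a sentence showing up under a keyword it doesn't contain. Below is a self-contained sketch that avoids this. The function name `find_matching_sentences` and the `seen` set are hypothetical, not from the question; the set only matters if the keyword list or the text itself contains repeats.

```python
import re

def find_matching_sentences(keywords, text):
    """Return (keyword, sentence) pairs, each pair at most once."""
    # Split on periods and drop empty or whitespace-only pieces
    sentences = [s.strip() for s in re.split(r'\.', text) if s.strip()]
    seen = set()
    results = []
    for keyword in keywords:
        for sentence in sentences:
            pair = (keyword, sentence)
            # Plain, case-sensitive substring test, as in the rewrite above
            if keyword in sentence and pair not in seen:
                seen.add(pair)
                results.append(pair)
    return results

keywords = ['Risk', 'ocp', 'cancer', 'menarche', 'estrogen', 'nulliparity']
txt = (" Risk factors for breast cancer have been well characterized. "
       "Factors associated with an increased exposure to estrogen have also "
       "been elucidated including early menarche, late menopause, later age "
       "at first pregnancy, or nulliparity.")

results = find_matching_sentences(keywords, txt)
for keyword, sentence in results:
    print("Sentence matching the word (" + keyword + "):")
    print(sentence)
```

Collecting the pairs first and printing afterwards keeps the matching logic separate from the output, which makes this kind of bug easier to spot.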