Python: read a text file and find paragraphs that contain two specific words

Time: 2017-07-23 19:35:26

Tags: python regex text find

I have a list of paragraphs like the one below, and I want to extract the paragraphs in which two specific words both appear.

["   Electronically monitored security systems are tailored to our customers' specific needs and involve the installation and use on a\ncustomer's premises of devices designed for intrusion detection and access control, as well as reaction to various occurrences or\nconditions, such as movement, fire, smoke, flooding, environmental conditions, industrial processes and other hazards. These\ndetection devices are connected to microprocessor-based control panels, which communicate to a monitoring center (located remotely\nfrom the customer's premises) where alarm and supervisory signals are received and recorded. In most systems, control panels can\nidentify the nature of the alarm and the areas within a building where the sensor was activated. Depending upon the type of service\nfor which the subscriber has contracted, monitoring center personnel respond to alarms by relaying appropriate information to the\nlocal fire or police departments, notifying the customer or taking other appropriate action, such as dispatching employees to the\ncustomer's premises. In some instances, the customer may monitor the system at its own premises or the system may be connected to\nlocal fire or police departments.", "   Whether systems are monitored by the customer at its premises or connected to one of our monitoring centers, we usually provide\nsupport and maintenance through service contracts. Systems installed at customers' premises may be owned by us or by our customers.",'   We market our electronic security services to commercial and residential customers through both a direct sales force and an\nauthorized dealer network. A separate national accounts sales force services large commercial customers. We also utilize advertising\nand direct mail to market our services.','   We provide residential electronic security services primarily in North America, Europe and South Africa, with a growing presence\nin the Asia-Pacific region. 
Our commercial customers include financial institutions, industrial and commercial businesses, federal,\nstate and local governments, defense installations, and health care and educational facilities. Our customers are often prompted to\npurchase security systems by their insurance carriers, which may offer lower insurance premium rates if a security system is\ninstalled or require that a system be installed as a condition to coverage. It has been our experience that the majority of\ncommercial and residential monitoring contracts are renewed after their initial terms. In general, relocations account for the\nlargest number of residential discontinuances while business closures comprise the largest single factor impacting commercial\ncontract attrition.', "   We are the leader in anti-theft systems. The majority of the world's leading retailers use our systems to protect against\nshoplifting and employee theft. We manufacture these SENSORMATIC electronic article surveillance systems and generally sell them\nthrough our direct sales force in North and South America, Europe, Australia, Asia and South Africa. A growing trend in the loss\nprotection"]

I wrote the code below, but it does not give me what I want. It returns paragraphs that contain either "customers" or "systems".

for pg in paragraphs:
    pfls = []
    pg= pg.replace("\n", ' ')
    if 'customers' in pg and 'system' in pg:
    print('1',pg)

What is wrong with my code?

1 answer:

Answer 0 (score: 1)

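In Python, indentation matters: it defines code blocks. In your example, the last three lines are not indented correctly, so they run only once, after the loop has finished, and test only the last paragraph.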

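for pg in paragraphs:
    pfls = []

    # Join wrapped lines so a phrase split across a line break still matches.
    pg = pg.replace("\n", ' ')
    # Both tests sit inside the loop, so every paragraph is checked.
    if 'customers' in pg and 'system' in pg:
        print('1', pg)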

will do what you want.
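As a minimal sketch of the difference, using an invented two-item list rather than the question's data: when the body is dedented, it runs once after the loop finishes, with `pg` still bound to the last item.

```python
# Invented sample data, for illustration only.
paragraphs = ["customers use the system", "no match here"]

# Dedented body: the loop only rebinds pg; the line after it runs once,
# after the loop, and therefore sees the last paragraph only.
tested_dedented = []
for pg in paragraphs:
    pass
tested_dedented.append(pg)

# Indented body: the line runs once per paragraph.
tested_indented = []
for pg in paragraphs:
    tested_indented.append(pg)

print(tested_dedented)
print(tested_indented)
```

The first list ends up with a single entry, the second with one entry per paragraph.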

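Edit: a quick and dirty way to do this is to isolate the word "system" by adding a space after it in the test, so that "systems" no longer matches: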

for pg in paragraphs:
    pfls = []

    pg = pg.replace("\n", ' ')
    # The trailing space in 'system ' keeps "systems" from matching.
    if 'customers' in pg and 'system ' in pg:
        print('1', pg)
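A small sketch of why the trailing-space trick is fragile, using made-up sample strings (not taken from the question's data): the space rules out "systems", but it also misses "system" when it is followed by punctuation or sits at the end of the string.

```python
# Hypothetical sample strings, invented for illustration only.
samples = [
    "the system may be connected",        # word followed by a space
    "security systems are tailored",      # plural: should not count as "system"
    "if a security system is installed",  # word followed by a space
    "monitor the system.",                # word followed by a period
    "we sell the system",                 # word at the very end of the string
]

# Plain substring test: also matches the plural "systems".
plain = ['system' in s for s in samples]

# Trailing-space test: rejects the plural, but also misses "system." and
# a "system" that ends the string.
spaced = ['system ' in s for s in samples]

print(plain)
print(spaced)
```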

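A better way to achieve this is to use regular expressions with word boundaries (\b), which match "system" as a whole word regardless of the punctuation around it: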

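import re

# \b is a word boundary: "system" matches, "systems" does not.
systemPattern = re.compile(r"\bsystem\b")
customersPattern = re.compile(r"\bcustomers\b")

for pg in paragraphs:
    pfls = []

    pg = pg.replace("\n", ' ')
    # Keep the paragraph only if both whole words occur somewhere in it.
    if systemPattern.search(pg) and customersPattern.search(pg):
        print('1', pg)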
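If the matching paragraphs need to be collected rather than printed, the same idea can be wrapped in a function. This is only a sketch: the helper name `paragraphs_with_all_words`, its parameters, and the sample list are assumptions made for illustration, not part of the original answer.

```python
import re

def paragraphs_with_all_words(paragraphs, words, ignore_case=False):
    """Return the paragraphs that contain every word in `words` as a whole word.

    Hypothetical helper for illustration; not from the original answer.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape guards against regex metacharacters in the search words.
    patterns = [re.compile(r"\b" + re.escape(w) + r"\b", flags) for w in words]
    matches = []
    for pg in paragraphs:
        flat = pg.replace("\n", " ")  # join wrapped lines before searching
        if all(p.search(flat) for p in patterns):
            matches.append(flat)
    return matches

# Invented sample data, for illustration only.
sample = [
    "Our customers rely on\nthe system every day.",
    "Security systems are sold to customers.",
    "The system is installed on site.",
]
result = paragraphs_with_all_words(sample, ["customers", "system"])
print(result)
```

Only the first sample paragraph contains both whole words. Note that \b works as expected here because the search words start and end with letters; words that begin or end with punctuation would need a different boundary test.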