Python regex to find a specific word in the middle of a line in a text file

Date: 2016-06-21 08:14:59

Tags: python regex grouping

I have a text file, and I want to find a word in the middle of a line. When I run my .py script, I get a "found_state not defined" error.

Consider this file:

file.conf
hostname(config)#aaa new-model
fdfsfd b
kthik
pooooo
shh

My Python script is as follows:

import re;    
import time;

with open('file.conf') as f:
    content = f.readlines()
name=''

for data in content:
    if re.search('(?<=#)\w+',data):
        found_state=1
        name=data
        break
if found_state==1:
    print name + "is Found"
else:
    print "NF"

2 answers:

Answer 0: (score: 0)

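If the condition if re.search('(?<=#)\w+',data): never succeeds, found_state is never assigned, so the later check if found_state==1: raises a NameError. Assign it an initial value (for example found_state = 0) before the for loop.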
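As a minimal sketch of that fix, the version below initializes found_state before the loop. It is written for Python 3 (so print is a function), and it assumes an in-memory list of the sample lines from the question in place of reading file.conf; everything else follows the original script.

```python
import re

# Sample lines standing in for the contents of file.conf (assumption:
# an in-memory list instead of f.readlines(), so the sketch is self-contained).
content = ["hostname(config)#aaa new-model", "fdfsfd b", "kthik", "pooooo", "shh"]

found_state = 0   # assigned before the loop, so it exists even if nothing matches
name = ''

for data in content:
    # Lookbehind: match word characters that are directly preceded by '#'.
    if re.search(r'(?<=#)\w+', data):
        found_state = 1
        name = data   # still the whole line, as in the original script
        break

if found_state == 1:
    print(name + " is Found")
else:
    print("NF")
```

With a list that contains no '#', the loop finishes without assigning inside the if, and the final check prints NF instead of failing.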

Answer 1: (score: 0)

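Since you say you need to get the "middle word", I understand that you need to extract that word. Right now, if there is a match, you get the whole line.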

The following piece of code will work for you (it prints aaa is Found):

import re;
content = ["hostname(config)#aaa new-model", "fdfsfd b", "kthik", "pooooo", "shh"] # <= TEST DATA
name=''
found_state = 0                       # Declare found_state
for data in content:
    m = re.search(r'#(\w+)',data)     # Use a raw string literal and a capturing group
    if m:                             # Check if there was a match and if yes
        found_state=1                 #   - set found_state to 1
        name=m.group(1)               #   - get the word after #
        break
if found_state==1:
    print name + " is Found"
else:
    print "NF"

However, you may want to reduce your code to

res = []
for data in content:
    res.extend(re.findall(r'#(\w+)', data))
print(res)

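The #(\w+) pattern matches a # and captures the word characters (one or more) that follow it. Because the pattern contains a capturing group, re.findall returns only the captured substrings, and extend adds all of those strings to the list.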
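To make the findall behavior concrete, here is a small sketch in Python 3. The helper name words_after_hash is hypothetical (it does not appear in the thread), and the sample list again stands in for the lines of file.conf. The last line shows that the lookbehind pattern from the question gives the same result without a capturing group.

```python
import re

def words_after_hash(lines):
    """Collect every run of word characters that directly follows a '#'.

    Hypothetical helper for illustration; it wraps the loop from the answer.
    """
    res = []
    for line in lines:
        # With one capturing group, findall returns only the group's text.
        res.extend(re.findall(r'#(\w+)', line))
    return res

# Sample lines standing in for file.conf (assumption: in-memory test data).
content = ["hostname(config)#aaa new-model", "fdfsfd b", "kthik", "pooooo", "shh"]

print(words_after_hash(content))              # ['aaa']
print(re.findall(r'(?<=#)\w+', content[0]))   # ['aaa']
```

The capturing-group form is usually easier to read; the lookbehind form is handy when the whole match itself should be just the word.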