Question

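I am using a home-grown tool to analyze computer configurations and verify that some basic settings have been applied. If they have not, the tool writes an alert to a text file on the host that runs it.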

The tool does not create one file per non-compliant computer; it writes a single file covering all of them.

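I want to parse this text file and extract the paragraph that corresponds to each computer, so that I can email the IT person responsible for that computer and tell him what he has to do.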

For example:

---- mycomputerone ---- 

 Hello

 During Test of mycomputerone following misconfiguration were detected
 - bad ip adress
 - bad name

 please could take the action to correct it and come back to us?

 ---- mycomputertwo ---- 

 Hello

 During Test of mycomputertwo following misconfiguration were detected
 - bad ip adress
 - bad name
 - administrative share available

 please could take the action to correct it and come back to us?

 ---- mycomputerthree ---- 
.....

I want to get the text between Hello and ?, but I cannot work out how to do it.

I tried

re.search(r'hello'(S*\w+)\?'), text)

It did not work. I read the file with

d = open(file, 'r', encoding="UTF-8")
text = d.read()

Answer 1

What you are after is

re.findall(r'(?m)^\s*Hello\s*[^?]+', d)

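where d is the whole file read as a single string (text in the question's code). See this demo. Because the pattern stops at the first ?, it will not work properly if the message body itself contains a ?.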
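
As a rough illustration of that regex approach, here is a minimal sketch run against an inline string instead of a file. The `text` sample is shortened from the question, and the names `paragraphs` and `truncated` are made up for this example. The last part shows the limitation mentioned above: a question mark inside the body cuts the match short.

```python
import re

# Sample shaped like the report in the question (shortened).
text = """---- mycomputerone ----

 Hello

 During Test of mycomputerone following misconfiguration were detected
 - bad ip adress
 - bad name

 please could take the action to correct it and come back to us?

 ---- mycomputertwo ----

 Hello

 During Test of mycomputertwo following misconfiguration were detected
 - bad ip adress

 please could take the action to correct it and come back to us?
"""

pattern = r'(?m)^\s*Hello\s*[^?]+'

# (?m) lets ^ match at the start of every line; [^?]+ consumes everything,
# newlines included, up to but not including the first question mark.
paragraphs = [m.strip() for m in re.findall(pattern, text)]

for p in paragraphs:
    print(p + "?")  # the match stops before the "?", so add it back
    print("=" * 20)

# Limitation: a "?" inside the body ends the match too early.
truncated = re.findall(pattern, " Hello\n is DHCP on? please fix?")
print(truncated)
```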

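I suggest reading the file line by line instead: check whether a line starts with ---, and otherwise append the line to the current record.

See the following Python demo: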

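items = []
tmp = ''
with open(file, 'r', encoding="UTF-8") as d:
    for line in d:
        if line.strip().startswith('---'):
            # A separator line closes the record collected so far
            if tmp.strip():
                items.append(tmp.strip())
            tmp = ''
        else:
            # Lines read from a file already end with "\n"
            tmp += line
# Flush the last record, which has no separator after it
if tmp.strip():
    items.append(tmp.strip())

print(items)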

Output:

['Hello\n\n During Test of mycomputerone following misconfiguration were detected\n - bad ip adress\n - bad name\n\n please could take the action to correct it and come back to us?', 
 'Hello\n\n During Test of mycomputertwo following misconfiguration were detected\n - bad ip adress\n - bad name\n - administrative share available\n\n please could take the action to correct it and come back to us?']
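
Since the goal in the question is to email the person responsible for each computer, it may help to keep the computer name together with its paragraph. The sketch below is one possible variation on the line-by-line idea, not part of the original answer: the function name `split_by_computer`, the header pattern, and the inline `sample` are assumptions made for illustration. It also assumes each computer appears only once in the file; a repeated name would overwrite the earlier entry.

```python
import re

def split_by_computer(lines):
    """Return {computer_name: message} from an iterable of report lines."""
    # Assumed header shape: dashes, a name without spaces, dashes.
    header = re.compile(r'^\s*-{3,}\s*(\S+)\s*-{3,}\s*$')
    records = {}
    current = None
    for line in lines:
        m = header.match(line)
        if m:
            # A header line starts a new record keyed by the computer name
            current = m.group(1)
            records[current] = []
        elif current is not None:
            records[current].append(line)
    # Join each record and trim the surrounding blank lines
    return {name: ''.join(body).strip() for name, body in records.items()}

sample = """---- mycomputerone ----

 Hello

 During Test of mycomputerone following misconfiguration were detected
 - bad ip adress
 - bad name

 please could take the action to correct it and come back to us?

 ---- mycomputertwo ----

 Hello

 During Test of mycomputertwo following misconfiguration were detected
 - bad ip adress
 - bad name
 - administrative share available

 please could take the action to correct it and come back to us?
"""

records = split_by_computer(sample.splitlines(keepends=True))
for name, message in records.items():
    print(name)
    print(message)
    print("=" * 20)
```

To run it on the real report, an open file object can be passed directly, since iterating over a file yields lines with their newline characters: `with open(file, 'r', encoding="UTF-8") as d: records = split_by_computer(d)`.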
