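Generating a text file with the results of a loop

Time: 2016-01-30 11:30:40

Tags: python

I have a text file containing 32 articles. I managed to find each article using the following code: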

import re

sections = []
current = []
with open("Aberdeen2005.txt") as f:
    for line in f:
        # A line matching "N of M DOCUMENTS" marks the start of a new article
        if re.search(r"(?i)\d+ of \d+ DOCUMENTS", line):
            sections.append("".join(current))
            current = [line]
        else:
            current.append(line)

print(len(sections))

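Next, I want to see how many articles contain the keywords I am interested in: tax and policy. In the same step, if an article contains one of them, I extract the month: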

months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']


for i in range(len(sections)): 

    if (' tax ' in sections[i]
    or ' Tax ' in sections[i]
    or ' policy ' in sections[i]
    or ' Policy ' in sections[i]):

        pat=re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)
        month = pat.search("\n".join(sections[i].splitlines()[0:6]))
        print(month)

Last but not least, I want to create a text file containing the months found above:

outfile = open('C:/Users/nn/Desktop/Uncertainty_Scot/dates.txt', 'w')
outfile.write(month.group(0))
outfile.close()

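The problem is that it only writes the last month found. I think this is because the write is not inside the loop. Any ideas on how to do this?

Kind regards!

1 Answer:

Answer 0: (score: 1)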

You just need to wrap your loop in a with statement for the output file, like this:

months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

with open(r'C:\Users\nn\Desktop\Uncertainty_Scot\dates.txt', 'w') as outfile:
    for i in range(len(sections)):
        if (' tax ' in sections[i] or ' Tax ' in sections[i] or ' policy ' in sections[i] or ' Policy ' in sections[i]):
            pat = re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)
            month = pat.search("\n".join(sections[i].splitlines()[0:6]))
            print(month)
            outfile.write(month.group(0))

You can improve your loop further by doing the following:

months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

with open('C:/Users/nn/Desktop/Uncertainty_Scot/dates.txt', 'w') as outfile:
    for s in sections:
        if any(x in s.lower() for x in [' tax ', ' policy ']):
            pat = re.compile("|".join([r"\b{}\b".format(m) for m in months]), re.M)
            month = pat.search("\n".join(s.splitlines()[0:6]))
            print(month)
            outfile.write(month.group(0))

By converting to lowercase first, you only need to test one version of each keyword, and the test will also catch entries written with other capitalization.
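
Two details of the write loop are worth guarding against. First, pat.search returns None when none of the first six lines contains a month name, and month.group(0) then raises AttributeError. Second, write does not add a newline, so consecutive months run together in the file. Here is a minimal sketch, assuming that one month per line is the desired output and that articles without a month in their header lines should simply be skipped. Neither assumption is stated in the question, and the helper name write_months is made up for illustration:

```python
import re

MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
          'August', 'September', 'October', 'November', 'December']

# Compile the pattern once instead of on every loop iteration.
MONTH_PAT = re.compile("|".join(r"\b{}\b".format(m) for m in MONTHS))
KEYWORDS = (' tax ', ' policy ')


def write_months(sections, outfile):
    """Write the month of each keyword-matching section, one per line.

    Hypothetical helper: `outfile` is any open, writable text file object.
    Returns the list of months that were written.
    """
    found = []
    for s in sections:
        lowered = s.lower()
        if any(k in lowered for k in KEYWORDS):
            # Only search the first six lines, as in the original loop.
            month = MONTH_PAT.search("\n".join(s.splitlines()[:6]))
            if month:  # assumption: skip articles with no month found
                found.append(month.group(0))
                outfile.write(month.group(0) + "\n")
    return found
```

It could be called from inside the same with block as before, for example write_months(sections, outfile), and the returned list also gives the count of matching articles via len.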