Python 3: Incrementing counts in a text file

Time: 2019-05-08 08:12:12

Tags: python python-3.x

I have a text file in which each line has the structure: word count

product 5
order 4
tracking 1

This means that the word product was found 5 times in the input document.

I have a script named WordFrequency.py that finds the words in an input file and counts how many times each one occurs:

import re
from collections import Counter

def count_words(file_path):
    with open("/Users/oliverbusk/Sites/Sandbox/storage/app/" + file_path, 'r', encoding="utf-8") as f:

        matches = re.findall(r'\b[a-zA-Z]{3,}\b', f.read())

        wordcount = Counter(matches)

        for word in wordcount:
            string = word + " " + str(wordcount[word])
            write_to_file(string)

def write_to_file(word):
    with open("/Dictionaries/eng.txt", "a+") as f:
        f.write(word + "\n")

So, basically, the code above reads the input file file_path and appends each word and its count to eng.txt.

However, every time I run it, the results are appended to the end of eng.txt again, for example:

product 5
order 4
tracking 1
product 5
order 4
tracking 1

Instead, if a word already exists in eng.txt, I want its count to be incremented.

1 Answer:

Answer 0: (score: 1)

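One approach is to read the existing contents of eng.txt first, add the new counts to them, and then write the whole file back.

For example: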

import re
from collections import Counter, defaultdict

def count_words(file_path):
    # Read the existing counts from eng.txt (the file must already exist)
    data = defaultdict(int)
    with open("/Dictionaries/eng.txt", "r") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            key, value = line.split()
            data[key] = int(value)

    with open("/Users/oliverbusk/Sites/Sandbox/storage/app/" + file_path, 'r', encoding="utf-8") as f:
        matches = re.findall(r'\b[a-zA-Z]{3,}\b', f.read())
        wordcount = Counter(matches)
        for word, count in wordcount.items():
            data[word] += count  # increment the stored count (new words start at 0)

    # Write the merged counts back, overwriting the old file
    write_to_file(data)

def write_to_file(data):
    with open("/Dictionaries/eng.txt", "w") as f:
        for word, count in data.items():
            string = word + " " + str(count)
            f.write(string + "\n")
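
The same read-then-increment idea can also be sketched with Counter alone, since Counter.update adds to existing counts instead of replacing them. This is only a minimal sketch under some assumptions: the function names load_counts and update_counts are hypothetical, both paths are passed in as parameters instead of being hard-coded, and a missing dictionary file is treated as empty.

```python
import re
from collections import Counter
from pathlib import Path


def load_counts(dict_path):
    """Read 'word count' lines into a Counter; a missing file gives an empty Counter."""
    counts = Counter()
    path = Path(dict_path)
    if not path.exists():
        return counts
    with path.open("r", encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            word, value = parts
            counts[word] += int(value)
    return counts


def update_counts(input_path, dict_path):
    """Add the word counts found in input_path to those stored in dict_path."""
    counts = load_counts(dict_path)
    text = Path(input_path).read_text(encoding="utf-8")
    # Same pattern as in the question: runs of at least three ASCII letters
    counts.update(re.findall(r"\b[a-zA-Z]{3,}\b", text))
    # Mode "w" overwrites the file, so each word ends up on exactly one line
    with open(dict_path, "w", encoding="utf-8") as f:
        for word, count in counts.items():
            f.write(f"{word} {count}\n")
    return counts
```

Calling update_counts twice on the same input file adds that file's counts twice, which is the incrementing behavior asked for in the question, and each word still appears on only one line of the dictionary file.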