How to count the occurrences of a string in a file and append the count to another file

Time: 2019-02-12 04:21:02

Tags: python file count

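I need to count the occurrences of "Product ID" in a .txt file and have the number written into that file. I'm new to Python and still trying to wrap my head around it. The counting works on its own in my code, but when I run the program it prints the number to the command line (hence the print). I tried print(count) >> "hardDriveSummary.txt" and print >> count, "hardDriveSummary.txt", but neither works.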

# Read .xml file and put lines starting with row_name or Product ID into new .txt file
search = 'row_name', 'Product ID'

#source file
with open('20190211-131516_chris_Hard_Drive_Order.xml') as f1:
    #output file
    with open('hardDriveSummary.txt', 'wt') as f2:
        lines = f1.readlines()
        for i, line in enumerate(lines):
            if line.startswith(search):
                f2.write("\n" + line)

#count how many occurrences of 'Product ID' in .txt file
def main():

    file  = open('hardDriveSummary.txt', 'r').read()
    team  = "Product ID"
    count = file.count(team)

    print(count)

main()

Sample of hardDriveSummary.txt:

Name          Country 1

Product ID                      : 600GB

Name         Country 2

Product ID                      : 600GB

Name           Country 1

Product ID                      : 450GB

Contents of the .xml file:

************* Server Summary *************

Server                      serv01
label                         R720
asset_no                   CNT3NW1
Name                     Country 1
name.1                       City1
Unnamed: 6                     NaN

************* Drive Summary **************

ID                              : 0:1:0
State                           : Failed
Product ID                      : 600GB
Serial No.                      : 6SL5KF5G


************* Server Summary *************

Server                      serv02
label                         R720
asset_no                   BZYGT03
Name                     Country 2
name.1                       City2
Unnamed: 6                     NaN

************* Drive Summary **************

ID                              : 0:1:0
State                           : Failed
Product ID                      : 600GB
Serial No.                      : 6SL5K75G


************* Server Summary *************

Server                      serv03
label                         R720
asset_no                   5GT4N51
Name                     Country 1
name.1                       City1  
Unnamed: 6                     NaN

************* Drive Summary **************

ID                              : 0:1:0
State                           : Failed
Product ID                      : 450GB
Serial No.                      : 6S55K5MG

2 Answers:

Answer 0 (score: 2)

If you just want to append the counter value to the end of the file, the code below should work:

import os

def main():
    # 'r+' opens the existing file in text mode for reading and writing, starting at the beginning
    with open('hardDriveSummary.txt', 'r+') as f:
        term = "Product ID"
        count = f.read().count(term)
        f.seek(0, os.SEEK_END)  # Make sure we are at the end of the file before writing
        f.write('\n' + str(count))

main()
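
The print >> syntax tried in the question is Python 2 only. As a minimal sketch, assuming Python 3 and a summary file that already ends with a newline, the same result can be had by reading the file first and then reopening it in append mode and passing file= to print. The helper name append_count and its parameters are hypothetical, introduced only for this illustration.

```python
def append_count(path, term):
    """Count occurrences of term in the file at path and append the count.

    Hypothetical helper: assumes the file exists and ends with a newline,
    so the count lands on a line of its own.
    """
    # First pass: read the whole file and count the search term.
    with open(path, 'r') as f:
        count = f.read().count(term)

    # Second pass: append mode always writes at the end of the file.
    with open(path, 'a') as f:
        print(count, file=f)  # Python 3 replacement for "print >> f, count"

    return count
```

It could be called as append_count('hardDriveSummary.txt', 'Product ID'). Opening the file twice avoids having to manage the file position by hand.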

Answer 1 (score: 0)

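Because Product ID is two separate words, split the whole text into groups of two consecutive words (bigrams); the following code will give you the expected result: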

from collections import Counter

# "sample.py" is the file name used in this answer; point it at your own file
with open(r"sample.py", "r") as f:
    words = f.read().split()
# Pair each word with the word that follows it
bigrams = zip(words, words[1:])
counts = Counter(bigrams)
data = {' '.join(k): v for k, v in counts.items()}
if 'Product ID' in data:
    print('Count of "Product ID": ', data['Product ID'])
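
If the only goal is to count one phrase regardless of how much whitespace separates its words, a regular expression may be a lighter alternative to building every bigram. The sketch below is an assumption-laden illustration, not part of the original answer: count_phrase is a hypothetical helper, and like str.count it matches substrings, so "Product IDs" would also be counted.

```python
import re

def count_phrase(text, phrase):
    """Count occurrences of phrase in text, allowing any run of whitespace
    (spaces, tabs, newlines) between the words of the phrase.

    Hypothetical helper for illustration only.
    """
    words = phrase.split()
    # Escape each word so punctuation is matched literally, then allow
    # one or more whitespace characters between consecutive words.
    pattern = r'\s+'.join(re.escape(w) for w in words)
    return len(re.findall(pattern, text))
```

For example, count_phrase(open('hardDriveSummary.txt').read(), 'Product ID') would count the phrase in the summary file from the question.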