Looping a task through all input files in Python

Time: 2017-08-16 15:27:53

Tags: python

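I am trying to count all the As, Bs and Cs in every .txt file I provide, and to produce a .csv file that lists the counts of those letters file by file.

The code here does everything I want, but it only reports the last file I provide instead of all of them.

What am I doing wrong?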

import glob
import csv

#This will print out all files loaded in  the same directory and print them out
for filename in glob.glob('*.txt*'):
    print(filename)

#A B and C
substringA = "A"
Head1 = (open(filename, 'r').read().count(substringA))
substringB = "B"
Head2 = (open(filename, 'r').read().count(substringB))
substringC = "C"
Head3 = (open(filename, 'r').read().count(substringC))
header = ("File", "A Counts" ,"B Counts" ,"C Counts")
analyzed = (filename, Head1, Head2, Head3)

#This will write a file named Analyzed.csv
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(header)
    writer.writerow(analyzed)

3 answers:

Answer 0 (score: 2)

The loop body is missing its indentation, and Analyzed.csv should be opened in append mode ('a') rather than write mode.

Edit: removed the unsupported a parameter
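
The difference between the two modes is what explains the symptom in the question. Here is a minimal sketch, assuming a throwaway file in a temporary directory (demo.csv, write_rows and read_rows are hypothetical names, not part of the question's code). Opening with 'w' truncates the file on every open, so only the last row survives, while 'a' keeps the earlier rows.

```python
import csv
import os
import tempfile


def write_rows(path, rows, mode):
    """Reopen `path` for every row in the given mode, as the question's loop would."""
    for row in rows:
        with open(path, mode, newline='') as csvfile:
            csv.writer(csvfile).writerow(row)


def read_rows(path):
    """Return every row currently stored in the CSV file."""
    with open(path, newline='') as csvfile:
        return list(csv.reader(csvfile))


rows = [("one.txt", 1), ("two.txt", 2), ("three.txt", 3)]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.csv")

    # 'w' truncates on every open, so each row replaces the previous one
    write_rows(path, rows, "w")
    rows_after_write = read_rows(path)

    os.remove(path)

    # 'a' appends, so all rows are kept
    write_rows(path, rows, "a")
    rows_after_append = read_rows(path)

print(rows_after_write)   # only the last row is left
print(rows_after_append)  # all three rows are present
```

Note that append mode also keeps whatever was in the file from an earlier run, so the output file has to be cleared or rewritten before the loop starts.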

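Answer 1 (score: 1)

Besides the indentation, one more small change is needed: you have to open the file in append mode rather than write mode. Note that when you open a file for appending you do not overwrite anything that was there before, so I added the part at the top to clear whatever is already in the csv.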

import glob
import csv


#This will delete anything in Analyzed.csv if it exists and replace it with the header
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    writer.writerow(header)

for filename in glob.glob('*.txt*'):
    print(filename)

    #A B and C
    substringA = "A"
    Head1 = (open(filename, 'r').read().count(substringA))
    substringB = "B"
    Head2 = (open(filename, 'r').read().count(substringB))
    substringC = "C"
    Head3 = (open(filename, 'r').read().count(substringC))
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    analyzed = (filename, Head1, Head2, Head3)

    #This will write a file named Analyzed.csv
    with open('Analyzed.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(analyzed)

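The above is my solution, keeping your code as untouched as possible. Ideally, though, you would open the output file only once, at the start. This is how you would do that: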

import glob
import csv


with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    header = ("File", "A Counts" ,"B Counts" ,"C Counts")
    writer.writerow(header)

    for filename in glob.glob('*.txt*'):
        print(filename)

        #Read the file once and close it, then count A, B and C in its text
        with open(filename, 'r') as infile:
            text = infile.read()
        Head1 = text.count("A")
        Head2 = text.count("B")
        Head3 = text.count("C")
        analyzed = (filename, Head1, Head2, Head3)

        writer.writerow(analyzed)
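
The same idea can be wrapped in a function so it can be pointed at any folder. This is only a sketch under assumptions: write_counts, input_dir, output_path and letters are names I made up for illustration, and it matches '*.txt' rather than the '*.txt*' pattern used above.

```python
import csv
import glob
import os


def write_counts(input_dir, output_path, letters="ABC"):
    """Write one CSV row per .txt file in input_dir with the count of each letter."""
    # The output file is opened once, so every input file adds its own row
    with open(output_path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["File"] + ["{} Counts".format(letter) for letter in letters])
        # sorted() makes the row order predictable; glob gives no ordering guarantee
        for filename in sorted(glob.glob(os.path.join(input_dir, '*.txt'))):
            with open(filename, 'r') as infile:
                text = infile.read()
            counts = [text.count(letter) for letter in letters]
            writer.writerow([os.path.basename(filename)] + counts)
```

Calling write_counts('.', 'Analyzed.csv') would cover the case in the question, with one row per file in the current directory.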

Answer 2 (score: 0)

You can try this:

import csv
import glob
from collections import Counter
from itertools import chain

# Open the output once so each file adds a row instead of overwriting the last one
with open('Analyzed.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(("File", "A Counts", "B Counts", "C Counts"))
    for filename in glob.glob('*.txt*'):
        # Count every character in the file, line by line
        with open(filename) as infile:
            the_count = Counter(chain.from_iterable(i.strip("\n") for i in infile))
        # writerow expects a sequence of fields; a bare string would be split into characters
        writer.writerow((filename, the_count["A"], the_count["B"], the_count["C"]))