How do I search files, count the hits, and write the number to column B of a .csv?

Time: 2014-04-16 10:46:38

Tags: python search csv count

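Here is my script. It currently finds all files in path that match *cycle*.log, then finds every line in those files that contains "timeout of" and writes those lines to outfilecamera, along with the name of the file they were found in.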

i = 0

ii = "\n"

for i in range(0,listlength):
    path = pathlist[i].strip()
    outfilecamera = join((path), 'cameratimeouts.txt')   
    os.chdir(path)
    for path in glob.glob("*cycle*.log"):
        with open(path) as f_in, open(outfilecamera, 'a') as f_out:
            f_out.writelines(path)
            f_out.writelines(ii)
            f_out.writelines(line for line in f_in if "timeout of" in line)

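What I would also like to do is COUNT the number of times a match is found in each file and put that number in column B of a csv file, i.e. each row would hold the number of hits for one file. Column A would ideally equal i.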

I have been looking for ages and cannot find a count function!?

Thanks for your help!

3 answers:

Answer 0 (score: 1)

The count is hidden here:

f_out.writelines(line for line in f_in if "timeout of" in line)

So all you have to do is consume the generator first, into something like a list:

matched_lines = list(line for line in f_in if "timeout of" in line)
f_out.writelines(matched_lines)
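
As a small illustration of why the list matters: a generator has no length and can only be iterated once, so the count has to come from something that keeps the matched lines. The sample text below is made up, and io.StringIO merely stands in for an open log file.

```python
import io

# Hypothetical sample data standing in for an open log file.
f_in = io.StringIO("start\ntimeout of 5s\nok\ntimeout of 9s\n")

matches = (line for line in f_in if "timeout of" in line)
# len(matches) would raise TypeError here: a generator has no length.

matched_lines = list(matches)  # consume the generator once and keep the lines
count = len(matched_lines)     # the list can be counted...
leftover = list(matches)       # ...but the generator is now exhausted

print(count, leftover)
```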

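Next, just collect the number of matches for each file. Create an empty list before the loop starts, then append the file name and the count inside the loop: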

file_counts = []

# .. your loop starts

    matched_lines = list(line for line in f_in if "timeout of" in line)
    f_out.writelines(matched_lines)
    file_counts.append((os.path.basename(path),len(matched_lines)))

Once you have finished processing the files:

with open('results.csv','w') as f:
    writer = csv.writer(f, delimiter=",")
    writer.writerow(['File Name','Count'])
    writer.writerows(file_counts)
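
Putting the pieces together, here is a rough sketch of one way to get the layout the question asks for, with i in column A and the hit count in column B. The function name write_hit_counts, the csv_path parameter, and the extra file-name column are assumptions made for illustration; the sketch also skips writing cameratimeouts.txt and uses os.path.join instead of os.chdir.

```python
import csv
import glob
import os


def write_hit_counts(pathlist, csv_path, needle="timeout of"):
    """Write one CSV row per log file: directory index, hit count, file name."""
    rows = []
    for i, directory in enumerate(pathlist):
        pattern = os.path.join(directory.strip(), "*cycle*.log")
        for log_path in sorted(glob.glob(pattern)):
            with open(log_path) as f_in:
                # Count the lines that contain the search text.
                count = sum(1 for line in f_in if needle in line)
            rows.append((i, count, os.path.basename(log_path)))
    # newline="" is the Python 3 way to let the csv module control line endings.
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows
```

It could then be called as write_hit_counts(pathlist, 'results.csv'), where pathlist is the list of directories from the question.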

Answer 1 (score: 0)

Here is what you need...

import csv
import glob

searched = 'timeout of'
# Python 3: open the CSV in text mode with newline='' (on Python 2, use mode 'wb' instead).
with open('output.csv', 'w', newline='') as csvfile:
    cwriter = csv.writer(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    cwriter.writerow(['File', 'Number'])
    for path in glob.glob("*cycle*.log"):
        with open(path) as f_in:
            # Count the lines in this file that contain the search text.
            n = 0
            for line in f_in:
                if searched in line:
                    n += 1
            cwriter.writerow([path, n])

Answer 2 (score: 0)

How about grep? It makes this task very simple.

grep -c 'timeout of' *cycle*.log

For csv-like output, you need to replace the colon with a comma:

grep -c 'timeout of' *cycle*.log | sed 's/:/,/'

And to put the result into the cameratimeouts.txt file:

grep -c 'timeout of' *cycle*.log | sed 's/:/,/' >cameratimeouts.txt

If you insist on using Python, my solution would be:

for i in range(0,listlength):
    path = pathlist[i].strip()
    outfilecamera = join((path), 'cameratimeouts.txt')   
    os.chdir(path)
    for path in glob.glob("*cycle*.log"):
        with open(path) as f_in, open(outfilecamera, 'a') as f_out:
            f_out.write('%s,%d\n' % (path, sum(l.count('timeout of') for l in f_in)))
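
One detail worth checking, shown here with made-up sample lines: grep -c reports the number of matching lines, while summing l.count(...) reports the number of occurrences, so the two can differ when a line contains the text more than once.

```python
# Hypothetical sample lines; the first one contains the search text twice.
lines = ["timeout of 1s, then timeout of 2s\n", "ok\n", "timeout of 3s\n"]

# Number of lines containing the text (what grep -c reports).
matching_lines = sum(1 for l in lines if "timeout of" in l)

# Number of occurrences of the text (what summing l.count(...) reports).
occurrences = sum(l.count("timeout of") for l in lines)

print(matching_lines, occurrences)
```

If the logs never repeat the text within a line, both approaches give the same number.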