How to read a large number of txt files in a specific folder using Python

Time: 2016-10-27 08:13:58

Tags: python python-2.7

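Please help me. I have a number of txt files in a folder. I want to read all of them and combine their data into a single txt file. How can I do this with Python? For example: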

My expected output:

folder name : data
file name in that folder : log1.txt
                           log2.txt
                           log3.txt
                           log4.txt
data in log1.txt : Size:         1,116,116,306 bytes
data in log2.txt : Size:         1,116,116,806 bytes
data in log3.txt : Size:         1,457,116,806 bytes
data in log4.txt : Size:         1,457,345,000 bytes

5 answers:

Answer 0: (score: 2)

Do you want to read the contents of each file and write them all to a different file?

import os

# names of the files in the directory "data", as a list
list_of_files = os.listdir("data")
lines = []
for file in list_of_files:
    # listdir returns bare names, so join each one with the folder
    with open(os.path.join("data", file), "r") as f:
        # add each line of the file to the list
        lines.extend(f.readlines())

# write the collected lines to result.txt
with open("result.txt", "w") as result:
    result.writelines(lines)

If you are looking for the file sizes rather than the contents, change the two lines:

    with open(os.path.join("data", file), "r") as f:
        lines.extend(f.readlines())

to:

    lines.append(str(os.stat(os.path.join("data", file)).st_size) + "\n")
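
The code above picks up every entry in the folder, not only the txt files the question asks about. A minimal sketch of one way to restrict it to `*.txt`, assuming the files should be concatenated in name order; the function name `concat_txt_files` and its parameters are made up for illustration:

```python
import glob
import os


def concat_txt_files(folder, out_path):
    """Concatenate every *.txt file in `folder` into `out_path`, in name order."""
    out_abs = os.path.abspath(out_path)
    # sorted() gives a deterministic order; skip the output file itself
    # in case it lives in the same folder from an earlier run
    paths = [p for p in sorted(glob.glob(os.path.join(folder, "*.txt")))
             if os.path.abspath(p) != out_abs]
    with open(out_path, "w") as out:
        for path in paths:
            with open(path, "r") as src:
                out.write(src.read())
    return paths
```

Calling `concat_txt_files("data", "result.txt")` would then combine log1.txt through log4.txt from the question into result.txt.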

Answer 1: (score: 1)

File concat.py

#!/usr/bin/env python
import sys, os

def main():
    folder = sys.argv[1] # argument contains path
    with open('result.txt', 'w') as result: # result file will be in current working directory
        for path in next(os.walk(folder))[2]: # list all files in provided path
            with open(os.path.join(folder, path), 'r') as source:
                result.write(source.read()) # write each file to result

main()

Usage: concat.py <your path>
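
If the individual files are large, reading each one fully into memory may not be desirable. A rough sketch of a streaming variant, assuming byte-for-byte concatenation is acceptable; `concat_streaming` is a hypothetical name, not part of the answer above:

```python
import os
import shutil


def concat_streaming(folder, out_path):
    """Append each regular file in `folder` to `out_path` without loading it whole."""
    out_abs = os.path.abspath(out_path)
    with open(out_path, "wb") as out:
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            # ignore sub-directories and the output file itself
            if not os.path.isfile(path) or os.path.abspath(path) == out_abs:
                continue
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # copies in fixed-size chunks
```

The files are opened in binary mode so that no decoding happens along the way; the output is simply the bytes of each input in name order.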

Answer 2: (score: 0)

  1. You have to find all the files you want to read:

    import os

    path = "data"
    files = os.listdir(path)
    
  2. You have to read all the files and collect the size and contents of each one:

    all_sz = {i:os.path.getsize(path+'/'+i) for i in files}
    all_data = ''.join([open(path+'/'+i).read() for i in files])
    
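  3. You need a formatted printout:

    msg = 'this is ...;'
    sp2 = ' '*4
    sp = ' '*len(msg) + sp2
    # the first line carries the label; later lines are indented to align with it
    for n, i in enumerate(sorted(all_sz)):
        prefix = msg + sp2 if n == 0 else sp
        print("{}{:,}".format(prefix, all_sz[i]))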
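
The three steps can also be folded into one helper that returns the report lines instead of printing them. This is only a sketch: the name `size_report` and the exact line layout (modelled loosely on the "Size: ... bytes" lines in the question) are assumptions.

```python
import os


def size_report(folder):
    """Return one 'name : Size: N bytes' line per file, sorted by file name."""
    lines = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            # "{:,}" inserts thousands separators, e.g. 1,116,116,306
            lines.append("{} : Size: {:,} bytes".format(name, os.path.getsize(path)))
    return lines
```

Writing the result to a file is then a matter of joining the lines with "\n" and calling write once.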
    

Answer 3: (score: 0)

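Import os. Then use os.listdir('data') to list the folder contents and store them in a list. For each entry, you can get its size by calling os.stat(entry).st_size, where entry is the path to the file including the folder. Each of these sizes can then be written to a file.

Putting it together: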

import os

outfile = open('result.txt', 'w')
path = 'data'
files = os.listdir(path)
for file in files:
    outfile.write(str(os.stat(path + "/" + file).st_size) + '\n')

outfile.close()

Answer 4: (score: 0)

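If you need to merge files that are already sorted so that the output file is sorted as well, you can use the merge function from the heapq standard library module.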

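from heapq import merge
from os import listdir
from os.path import join

path = 'data'           # example value: folder holding the sorted input files
outfile = 'result.txt'  # example value: name of the merged output file

# listdir returns bare names, so join each one with the folder
files = [open(join(path, f)) for f in listdir(path)]
with open(outfile, 'w') as out:
    for rec in merge(*files):
        out.write(rec)
for f in files:
    f.close()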

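Records are sorted in lexicographic order. If a different order is needed, merge accepts an optional key=... argument to specify a different sort function (available from Python 3.5; it does not exist in Python 2.7).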
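
As an illustration of the key argument, here is a small sketch that merges lines ordered by a leading integer rather than lexicographically. It assumes Python 3.5 or later and that every line starts with a number; `merge_by_number` is a hypothetical helper name.

```python
from heapq import merge


def merge_by_number(*sorted_inputs):
    """Merge iterables of lines that are each sorted by their leading integer."""
    # key= needs Python 3.5+; each input must already be sorted by the same key
    return list(merge(*sorted_inputs, key=lambda line: int(line.split()[0])))


a = ["2 b\n", "10 x\n"]
b = ["3 c\n", "7 d\n"]
merged = merge_by_number(a, b)
```

With a plain lexicographic merge, "10 x" would sort before "2 b"; with the numeric key it ends up last. Open file objects can be passed in place of the lists, since merge only needs iterables.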