Writing multiple lists to multiple output files

Time: 2015-05-28 12:37:08

Tags: python list io

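I am working with a dataset stored in a large text file. For the analysis I am carrying out, I open the file, extract parts of the dataset and compare the extracted subsets. My code looks like this: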

from math import ceil

with open("seqs.txt", "r") as f:
    f = f.readlines()

assert type(f) == list, "ERROR: file object not converted to list"

fives = int(ceil(0.05 * len(f)))
thirds = int(ceil(len(f) / 3.0))  # 3.0 avoids integer division on Python 2

## top/bottom 5% of dataset
low_5=f[0:fives]
top_5=f[-fives:]

## top/bottom 1/3 of dataset
low_33=f[0:thirds]
top_33=f[-thirds:]

## Write lists to file
# top-5
with open("high-5.out","w") as outfile1:
    for i in top_5:
        outfile1.write("%s" %i)
# low-5
with open("low-5.out","w") as outfile2:
    for i in low_5:
        outfile2.write("%s" %i)
# top-33
with open("high-33.out","w") as outfile3:
    for i in top_33:
        outfile3.write("%s" %i)
# low-33        
with open("low-33.out","w") as outfile4:
    for i in low_33:
        outfile4.write("%s" %i)

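I am trying to find a smarter way to automate writing the lists out to files. In this case there are only four, but in future cases I could end up with as many as 15-25 lists, and I would like a function to take care of this. I wrote the following: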

def write_to_file(*args):
    for i in args:
        with open(".out", "w") as outfile:
            outfile.write("%s" %i)

But when I call the function, the resulting file contains only the last list:

write_to_file(low_33,low_5,top_33,top_5)

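I know that I have to define an output file for each list (which I am not doing in the function above); I am just not sure how to implement it. Any ideas?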

5 answers:

Answer 0 (score: 1)

You can have one output file per argument by incrementing a counter for each argument. For example:

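def write_to_file(*args):
    for index, lines in enumerate(args):
        with open("{}.out".format(index + 1), "w") as outfile:
            # writelines() writes each element as-is; "%s" % lines would
            # write the list's repr (brackets, quotes and all).
            outfile.writelines(lines)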

The example above will create the output files "1.out", "2.out", "3.out" and "4.out".

Alternatively, if you have specific names you want to use (as in your original code), you could do something like this:

def write_to_file(args):
    for name, data in args:
        with open("{}.out".format(name), "w") as outfile:
            # Write the elements themselves, not the list's repr.
            outfile.writelines(data)

args = [('low-33', low_33), ('low-5', low_5), ('high-33', top_33), ('high-5', top_5)]
write_to_file(args)

This will create the output files "low-33.out", "low-5.out", "high-33.out" and "high-5.out".
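
A self-contained sketch of the named variant, for trying it out without the original seqs.txt. The sample lines and the `out_dir` parameter are assumptions added for illustration, not part of the answer; `writelines` is used so that each element lands in the file as-is.

```python
import os
import tempfile

def write_to_file(pairs, out_dir="."):
    """Write each (name, lines) pair to <out_dir>/<name>.out."""
    for name, lines in pairs:
        path = os.path.join(out_dir, "{}.out".format(name))
        with open(path, "w") as outfile:
            # Each element already ends with a newline, as with readlines().
            outfile.writelines(lines)

# Hypothetical sample data standing in for the lines of seqs.txt.
sample = ["seq{}\n".format(n) for n in range(20)]
out_dir = tempfile.mkdtemp()
write_to_file([("low-5", sample[:1]), ("high-5", sample[-1:])], out_dir)
written = sorted(os.listdir(out_dir))
```

Because every list gets its own file name, no call overwrites the output of a previous one.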

Answer 1 (score: 1)

Make your variable names match your file names, then use a dictionary to hold them instead of keeping them in the global namespace:

# The values are the slices computed in the question.
data = {
    'high_5': top_5,
    'low_5': low_5,
    'high_33': top_33,
    'low_33': low_33,
}

for key in data:
    with open('{}.out'.format(key), 'w') as output:
        for i in data[key]:
            output.write(i)

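This keeps your data in one easy-to-use place and, assuming you want to apply the same operations to all of it, lets you keep using the same pattern.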

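As PM2Ring notes below, it is advisable to use underscores (as in the variable names) rather than dashes (as in the file names), because that lets you pass the dictionary keys as keyword arguments to a writing function: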

write_to_file(**data)

This is equivalent to:

write_to_file(low_5=f[:fives], high_5=f[-fives:],...) # and the rest of the data

From there you can adapt one of the functions defined in the other answers so that it accepts keyword arguments.
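
A minimal sketch of what a keyword-argument version of `write_to_file` could look like, so that `write_to_file(**data)` works as described. The `out_dir` parameter and the sample dictionary are assumptions made for this illustration.

```python
import os
import tempfile

def write_to_file(out_dir=".", **named_lists):
    """Write each keyword argument to <out_dir>/<keyword>.out."""
    for name, lines in named_lists.items():
        path = os.path.join(out_dir, "{}.out".format(name))
        with open(path, "w") as outfile:
            outfile.writelines(lines)

# Hypothetical stand-ins for the slices taken from seqs.txt.
data = {"low_5": ["a\n", "b\n"], "high_5": ["y\n", "z\n"]}
out_dir = tempfile.mkdtemp()
write_to_file(out_dir=out_dir, **data)
created = sorted(os.listdir(out_dir))
```

The keys must be valid Python identifiers for the `**` unpacking to work, which is why underscores are preferred over dashes here.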

Answer 2 (score: 1)

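Don't try to be clever. Aim instead for code that is readable and easy to understand. You can group the repeated code into a function, for example: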

from math import ceil

def save_to_file(data, filename):
    # Text mode: the items are strings, which a 'wb' file rejects on Python 3.
    with open(filename, 'w') as f:
        for item in data:
            f.write('{}'.format(item))

with open('data.txt') as f:
    numbers = list(f)

five_percent = int(len(numbers) * 0.05)
thirty_three_percent = int(ceil(len(numbers) / 3.0))
# Why not: thirty_three_percent = int(len(numbers) * 0.33)
save_to_file(numbers[:five_percent], 'low-5.out')
save_to_file(numbers[-five_percent:], 'high-5.out')
save_to_file(numbers[:thirty_three_percent], 'low-33.out')
save_to_file(numbers[-thirty_three_percent:], 'high-33.out')

Update

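If you have many lists to write, then using a loop makes sense. I suggest having two functions, save_top_n_percent and save_low_n_percent, to help with the job. They contain some duplicated code, but splitting them into two functions makes the code clearer and easier to understand.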

def save_to_file(data, filename):
    # Text mode: the items are strings, which a 'wb' file rejects on Python 3.
    with open(filename, 'w') as f:
        for item in data:
            f.write(item)

def save_top_n_percent(n, data):
    n_percent = int(len(data) * n / 100.0)
    # Slice from an explicit start index: data[-0:] would be the whole list.
    save_to_file(data[len(data) - n_percent:], 'top-{}.out'.format(n))

def save_low_n_percent(n, data):
    n_percent = int(len(data) * n / 100.0)
    save_to_file(data[:n_percent], 'low-{}.out'.format(n))

with open('data.txt') as f:
    numbers = list(f)

for n_percent in [5, 33]:
    save_top_n_percent(n_percent, numbers)
    save_low_n_percent(n_percent, numbers)
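
As a hedged sketch, the percentage arithmetic can also be isolated in one helper and checked on its own. The name `split_ends` is hypothetical, and rounding up with `ceil` follows the question rather than the truncating `int(...)` used in this answer.

```python
from math import ceil

def split_ends(data, percent):
    """Return the (lowest, highest) slices, each covering `percent` of data."""
    count = int(ceil(len(data) * percent / 100.0))
    # data[-0:] would return the whole list, so slice from an explicit index.
    return data[:count], data[len(data) - count:]

# Hypothetical data: 40 items, so 5% is 2 items and 33% rounds up to 14.
numbers = list(range(1, 41))
low_5, top_5 = split_ends(numbers, 5)
low_33, top_33 = split_ends(numbers, 33)
empty_low, empty_top = split_ends(numbers, 0)
```

Keeping the slicing separate from the file writing makes the edge cases (an empty list, or a percentage of zero) easy to test without touching the disk.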

Answer 3 (score: 0)

On this line, you open a file named .out every time and write to it.

with open(".out", "w") as outfile:

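You need to make the ".out" file name unique for each i in args. You can do this by passing lists as the args, with each list containing the file name and the data.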

def write_to_file(*args):
    for i in args:
        # i[0] is the file name stem, i[1] is the list of lines.
        with open("%s.out" % i[0], "w") as outfile:
            outfile.writelines(i[1])

Pass the arguments like this:

write_to_file(["low_33",low_33],["low_5",low_5],["top_33",top_33],["top_5",top_5])
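
The overwriting described in this answer can be reproduced in isolation. This is only an illustrative sketch; the temporary directory and the three sample lists are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), ".out")
for lines in (["first\n"], ["second\n"], ["third\n"]):
    # Mode "w" truncates the file on every open, discarding earlier writes.
    with open(path, "w") as outfile:
        outfile.writelines(lines)

with open(path) as infile:
    remaining = infile.read()
```

Only the last list survives in `remaining`, which matches what the question observed.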

Answer 4 (score: 0)

You are creating a file named ".out" and overwriting it every time.

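def write_to_file(*args):
    for i in args:
        filename = i + ".out"
        # Look up the module-level variable whose name is the string i.
        contents = globals()[i]
        # Open the per-variable file name, not the fixed ".out".
        with open(filename, "w") as outfile:
            outfile.writelines(contents)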


write_to_file("low_33", "low_5", "top_33", "top_5")

https://stackoverflow.com/a/6504497/3583980 (variable names from strings)

This will create low_33.out, low_5.out, top_33.out and top_5.out, whose contents will be the lists stored in those variables.
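
Note that `globals()` only sees module-level names in the module where the function is defined. A possible variation, sketched here under the assumption that the lists may live inside a function, is to pass the lookup table explicitly; `write_named`, `analyse` and `out_dir` are hypothetical names.

```python
import os
import tempfile

def write_named(namespace, names, out_dir="."):
    """Write namespace[name] to <out_dir>/<name>.out for each name."""
    for name in names:
        path = os.path.join(out_dir, name + ".out")
        with open(path, "w") as outfile:
            outfile.writelines(namespace[name])

def analyse(out_dir):
    low_5 = ["a\n", "b\n"]
    top_5 = ["y\n", "z\n"]
    # locals() here plays the role globals() plays in the answer above.
    write_named(locals(), ["low_5", "top_5"], out_dir)

out_dir = tempfile.mkdtemp()
analyse(out_dir)
created = sorted(os.listdir(out_dir))
```

Any mapping from names to lists works as the first argument, so an ordinary dictionary can be passed just as well.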