
Question

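I am working with datasets stored in large text files. For the analysis I am carrying out, I open a file, extract parts of the dataset, and compare the extracted subsets. My code looks like this: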

from math import ceil

with open("seqs.txt","rb") as f:
    f = f.readlines()

assert type(f) == list, "ERROR: file object not converted to list"

fives = int( ceil(0.05*len(f)) ) 
thirds = int( ceil(len(f)/3) )

## top/bottom 5% of dataset
low_5=f[0:fives]
top_5=f[-fives:]

## top/bottom 1/3 of dataset
low_33=f[0:thirds]
top_33=f[-thirds:]

## Write lists to file
# top-5
with open("high-5.out","w") as outfile1:
   for i in top_5:
       outfile1.write("%s" %i)
# low-5
with open("low-5.out","w") as outfile2:
    for i in low_5:
        outfile2.write("%s" %i)
# top-33
with open("high-33.out","w") as outfile3:
    for i in top_33:
        outfile3.write("%s" %i)
# low-33        
with open("low-33.out","w") as outfile4:
    for i in low_33:
        outfile4.write("%s" %i)

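I am trying to find a smarter way to automate writing the lists out to files. There are only four in this case, but in future I may end up with as many as 15-25 lists, so I would like a function to take care of this. I wrote the following: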

def write_to_file(*args):
    for i in args:
        with open(".out", "w") as outfile:
            outfile.write("%s" %i)

But when I call the function, the resulting file contains only the final list:

write_to_file(low_33,low_5,top_33,top_5)

I know I have to define an output file for each list (which I am not doing in the function above); I am just not sure how to implement it. Any ideas?

Answer 1

You can have one output file per argument by incrementing a counter for each argument. For example:

def write_to_file(*args):
    for index, i in enumerate(args):
        with open("{}.out".format(index+1), "w") as outfile:
            # write the lines themselves; "%s" % i would write the list's repr
            outfile.writelines(i)

The example above will create the output files "1.out", "2.out", "3.out" and "4.out".

Alternatively, if you have specific names you want to use (as in your original code), you can do the following:

def write_to_file(args):
    for name, data in args:
        with open("{}.out".format(name), "w") as outfile:
            # write each line of the list, as the original loops did
            outfile.writelines(data)

args = [('low-33', low_33), ('low-5', low_5), ('high-33', top_33), ('high-5', top_5)]
write_to_file(args)

This will create the output files "low-33.out", "low-5.out", "high-33.out" and "high-5.out".
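
For the 15-25 list case mentioned in the question, the (name, data) pairs could be generated rather than typed out. The following is only a sketch, assuming every list is a leading or trailing percentage of one list of lines; `build_pairs` and the percentages used are hypothetical names and values chosen for illustration, not part of the original code.

```python
from math import ceil


def build_pairs(lines, percents):
    """Return [(name, slice), ...] for the low and high end of each percentage."""
    pairs = []
    for pct in percents:
        # number of lines covered by pct percent, rounded up
        count = int(ceil(len(lines) * pct / 100.0))
        pairs.append(("low-{}".format(pct), lines[:count]))
        pairs.append(("high-{}".format(pct), lines[-count:]))
    return pairs


# Made-up data standing in for the lines read from the input file
lines = ["%d\n" % i for i in range(20)]
pairs = build_pairs(lines, [5, 25])
names = [name for name, _ in pairs]
```

The resulting list has the same shape as `args` above, so it could be handed to the second `write_to_file(args)` as it is.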

Answer 2

Make your variable names match your file names, then hold them in a dictionary instead of keeping them in the global namespace:

# f, fives and thirds are the names defined in the question's code
data = {
    'high_5': f[-fives:],
    'low_5': f[:fives],
    'high_33': f[-thirds:],
    'low_33': f[:thirds],
}

for key in data:
    with open('{}.out'.format(key), 'w') as output:
        for i in data[key]:
            output.write(i)

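This keeps your data in one easy-to-use place and, assuming you want to apply the same operations to each of them, you can keep using the same paradigm.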

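As PM2Ring notes below, it is advisable to use underscores (as in the variable names) rather than dashes (as in the file names), because then the dictionary keys can be passed as keyword arguments to the writing function: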

write_to_file(**data)

This is equivalent to:

write_to_file(low_5=f[:fives], high_5=f[-fives:],...) # and the rest of the data

From there you can use one of the functions defined in the other answers.
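
To accept the dictionary through `**data`, the writing function has to take keyword arguments. A minimal sketch of what that might look like follows; the `out_dir` parameter and the sample data are assumptions added for illustration (and `out_dir` would clash with a dictionary key of the same name), so treat it as one possible shape rather than the function this answer had in mind.

```python
import os
import tempfile


def write_to_file(out_dir=".", **named_lists):
    """Write each keyword argument's lines to <out_dir>/<keyword>.out."""
    written = []
    for name, lines in named_lists.items():
        path = os.path.join(out_dir, "{}.out".format(name))
        with open(path, "w") as outfile:
            # writelines emits each string as-is; the lines keep their own "\n"
            outfile.writelines(lines)
        written.append(path)
    return sorted(written)


# Made-up data standing in for the slices of f; out_dir is a hypothetical extra
demo_dir = tempfile.mkdtemp()
data = {"low_5": ["a\n"], "high_5": ["z\n"]}
paths = write_to_file(out_dir=demo_dir, **data)
```

Because the keys become file names, this only works when every key is a valid Python identifier, which is the reason for preferring underscores over dashes.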

Answer 3

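Don't try to be clever. Aim instead for code that is readable and easy to understand. You can group the repeated code into a function, for example: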

from math import ceil

def save_to_file(data, filename):
    with open(filename, 'w') as f:
        for item in data:
            f.write('{}'.format(item))

with open('data.txt') as f:
    numbers = list(f)

five_percent = int(len(numbers) * 0.05)
thirty_three_percent = int(ceil(len(numbers) / 3.0))
# Why not: thirty_three_percent = int(len(numbers) * 0.33)
save_to_file(numbers[:five_percent], 'low-5.out')
save_to_file(numbers[-five_percent:], 'high-5.out')
save_to_file(numbers[:thirty_three_percent], 'low-33.out')
save_to_file(numbers[-thirty_three_percent:], 'high-33.out')

Update

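If there are many lists to write, then using a loop makes sense. I suggest having two functions, save_top_n_percent and save_low_n_percent, to help with the job. They contain some duplicated code, but splitting them into two functions makes things clearer and easier to understand.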

def save_to_file(data, filename):
    with open(filename, 'w') as f:
        for item in data:
            f.write(item)

def save_top_n_percent(n, data):
    n_percent = int(len(data) * n / 100.0)
    save_to_file(data[-n_percent:], 'top-{}.out'.format(n))

def save_low_n_percent(n, data):
    n_percent = int(len(data) * n / 100.0)
    save_to_file(data[:n_percent], 'low-{}.out'.format(n))

with open('data.txt') as f:
    numbers = list(f)

for n_percent in [5, 33]:
    save_top_n_percent(n_percent, numbers)
    save_low_n_percent(n_percent, numbers)
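
One edge case may be worth guarding against: when the computed count rounds down to 0 (a small file or a small percentage), `data[-0:]` is the same as `data[0:]` and returns the whole list instead of an empty one. The sketch below shows one way to handle that; `top_slice` and `low_slice` are hypothetical helpers written for this illustration, not functions from the answer.

```python
def top_slice(data, n_percent):
    """Return the last n_percent of data, or an empty list if that rounds to 0."""
    count = int(len(data) * n_percent / 100.0)
    # data[-0:] would be the entire list, so handle count == 0 explicitly
    return data[-count:] if count else []


def low_slice(data, n_percent):
    """Return the first n_percent of data; data[:0] is already empty."""
    count = int(len(data) * n_percent / 100.0)
    return data[:count]


# Ten made-up lines: 5 percent of 10 rounds down to 0 items
numbers = ["%d\n" % i for i in range(10)]
top_5 = top_slice(numbers, 5)
top_33 = top_slice(numbers, 33)
```

Only the top slice needs the guard, since slicing from the front with a count of 0 already yields an empty list.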

Answer 4

On this line, you open a file named .out and write to it every time.

with open(".out", "w") as outfile:
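
The reason only the final list survives is that mode "w" truncates the file each time it is opened. A small demonstration, using a temporary directory and made-up contents as stand-ins for the real data:

```python
import os
import tempfile

# Hypothetical location for the demo; the name ".out" mirrors the question
path = os.path.join(tempfile.mkdtemp(), ".out")

for chunk in (["first\n"], ["second\n"], ["third\n"]):
    # Mode "w" truncates the file on every open, discarding earlier writes
    with open(path, "w") as outfile:
        outfile.writelines(chunk)

with open(path) as infile:
    remaining = infile.read()
```

After the loop, `remaining` holds only what the last iteration wrote.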

You need to make the ".out" name unique for each i in args. You can do that by passing lists as the args, each containing the file name and the data.

def write_to_file(*args):
    for i in args:
        with open("%s.out" % i[0], "w") as outfile:
            # i[0] is the file name, i[1] the list of lines
            outfile.writelines(i[1])

Pass the arguments like this:

write_to_file(["low_33",low_33],["low_5",low_5],["top_33",top_33],["top_5",top_5])

Answer 5

You are creating a file named ".out" and overwriting it every time.

def write_to_file(*args):
    for i in args:
        filename = i + ".out"
        # look up the module-level variable whose name is the string i
        contents = globals()[i]
        with open(filename, "w") as outfile:
            outfile.write("%s" %contents)


write_to_file("low_33", "low_5", "top_33", "top_5")

https://stackoverflow.com/a/6504497/3583980 (variable name from a string)

This will create low_33.out, low_5.out, top_33.out and top_5.out, whose contents will be the lists stored in those variables.

Writing multiple lists to multiple output files

5 answers:
