Nested with blocks in Python when the level of nesting is variable

Time: 2014-07-01 14:28:19

Tags: python csv

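I want to combine columns from several CSV files into a single CSV file with a new header, concatenated horizontally. I only want to pick certain columns, selected by heading. Each of the files to be combined has different columns.

Example input: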

freestream.csv:

static pressure,static temperature,relative Mach number
1.01e5,288,5.00e-02

fan.csv:

static pressure,static temperature,mass flow
0.9e5,301,72.9

exhaust.csv:

static pressure,static temperature,mass flow
1.7e5,432,73.1

Desired output:

combined.csv:

P_amb,M0,Ps_fan,W_fan,W_exh
1.01e5,5.00e-02,0.9e5,72.9,73.1

The function might be called like this:

reorder_multiple_CSVs(["freestream.csv","fan.csv","exhaust.csv"],
    "combined.csv",["static pressure,relative Mach number",
    "static pressure,mass flow","mass flow"],
    "P_amb,M0,Ps_fan,W_fan,W_exh")

Here is an earlier version of the code, which only allows a single input file. I wrote it with help from "write CSV columns out in a different order in Python":

import csv
import operator

def reorder_CSV(infilename,outfilename,oldheadings,newheadings):
    # newline='' (Python 3) lets the csv module handle line endings itself
    with open(infilename,newline='') as infile:
        with open(outfilename,'w',newline='') as outfile:
            reader = csv.reader(infile)
            writer = csv.writer(outfile)
            # the first row holds the column names
            readnames = next(reader)
            name2index = dict((name,index) for index, name in enumerate(readnames))
            writeindices = [name2index[name] for name in oldheadings.split(",")]
            reorderfunc = operator.itemgetter(*writeindices)
            writer.writerow(newheadings.split(","))
            for row in reader:
                towrite = reorderfunc(row)
                # itemgetter returns a bare string when only one column is selected
                if isinstance(towrite,str):
                    writer.writerow([towrite])
                else:
                    writer.writerow(towrite)
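So, to adapt this to multiple files, this is what I have worked out so far:

- I now need lists for infilename, oldheadings and newheadings (all of the same length)

- I need to iterate over the list of input files to build a list of readers

- readnames can also be a list, built by iterating over the readers

- which means I can make name2index a list of dictionaries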
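
The plan in the list above could be sketched roughly as follows. This is only an illustration: it uses io.StringIO objects in place of real files so that no with blocks are needed yet, and the names sources, wanted, first_rows and picked are made up for the example.

```python
import csv
import io

# In-memory stand-ins for two of the input files (sample data from the question).
sources = [
    io.StringIO("static pressure,static temperature,relative Mach number\n"
                "1.01e5,288,5.00e-02\n"),
    io.StringIO("static pressure,static temperature,mass flow\n"
                "0.9e5,301,72.9\n"),
]
# One comma-separated heading string per input, as in the proposed call.
wanted = ["static pressure,relative Mach number", "static pressure,mass flow"]

# One reader per input.
readers = [csv.reader(src) for src in sources]
# One header row per reader.
readnames = [next(reader) for reader in readers]
# One name -> column index dictionary per reader.
name2index = [dict((name, i) for i, name in enumerate(names)) for names in readnames]
# One list of column indices to copy per reader.
writeindices = [
    [lookup[name] for name in headings.split(",")]
    for lookup, headings in zip(name2index, wanted)
]

# Pull the selected columns out of the first data row of every input.
first_rows = [next(reader) for reader in readers]
picked = [row[i] for row, indices in zip(first_rows, writeindices) for i in indices]
print(picked)
```

Every per-file object simply becomes a list with one entry per input, and zip keeps the lists aligned.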

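The one thing I don't know how to do is use the with keyword nested n levels deep, when n is only known at run time. I read this: How can I open multiple files using "with open" in Python? but that only seems to work when you know in advance how many files you need to open.

Or is there a better way?

I'm quite new to Python, so I'd appreciate any tips you can give me.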

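2 answers:

Answer 0: (score: 2)

I am only answering the part about using with to open multiple files when the number of files is not known beforehand. It shouldn't be too hard to write your own contextmanager, something like this (completely untested):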

from contextlib import contextmanager

@contextmanager
def open_many_files(filenames):
    files = []
    try:
        # open inside the try block so that a failed open() still closes the earlier files
        for filename in filenames:
            files.append(open(filename))
        yield files
    finally:
        for f in files:
            f.close()

You can use it like this:

innames = ['file1.csv', 'file2.csv', 'file3.csv']
outname = 'out.csv'
with open_many_files(innames) as infiles, open(outname, 'w') as outfile:
    for infile in infiles:
        do_stuff(infile)

There is also a function that does something similar (presumably contextlib.nested), but it is deprecated.
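
On Python 3.3 and later, contextlib.ExitStack from the standard library handles a number of files that is only known at run time, without a hand-written context manager. A minimal sketch, assuming that Python version; the helper name read_first_lines and the throwaway files are made up for the illustration:

```python
import os
import tempfile
from contextlib import ExitStack

def read_first_lines(filenames):
    """Open however many files are named and return the first line of each."""
    with ExitStack() as stack:
        # enter_context registers each file so that all of them are closed
        # when the with block ends, even if a later open() raises.
        files = [stack.enter_context(open(name)) for name in filenames]
        return [f.readline().rstrip("\n") for f in files]

# Demonstration with three throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    names = []
    for i in range(3):
        path = os.path.join(tmpdir, "file%d.csv" % (i + 1))
        with open(path, "w") as f:
            f.write("header%d\n" % (i + 1))
        names.append(path)
    first_lines = read_first_lines(names)

print(first_lines)
```

The stack plays the role of the n nested with blocks: files are closed in reverse order of opening when the block exits.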

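Answer 1: (score: 0)

I'm not sure this is the right way to do it, but I want to expand on Bas Swinckels' answer. His very helpful answer has a few small inconsistencies, and I want to give the relevant code.

This is what I did, and it works.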

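from contextlib import contextmanager
import csv
import operator
import itertools as IT

@contextmanager
def open_many_files(filenames):
    files=[]
    try:
        # open inside the try block so that a failed open() still closes the earlier files
        for filename in filenames:
            files.append(open(filename,'r',newline=''))
        yield files
    finally:
        for f in files:
            f.close()

def reorder_multiple_CSV(infilenames,outfilename,oldheadings,newheadings):
    # infilenames and newheadings are comma-separated strings;
    # oldheadings is a list holding one comma-separated string per input file
    with open_many_files(filter(None,infilenames.split(','))) as handles:
        with open(outfilename,'w',newline='') as outfile:
            readers=[csv.reader(f) for f in handles]
            writer = csv.writer(outfile)
            reorderfunc=[]
            ncolumns=[]
            for i, reader in enumerate(readers):
                readnames = next(reader)
                name2index = dict((name,index) for index, name in enumerate(readnames))
                writeindices = [name2index[name] for name in filter(None,oldheadings[i].split(","))]
                reorderfunc.append(operator.itemgetter(*writeindices))
                ncolumns.append(len(writeindices))
            writer.writerow(filter(None,newheadings.split(",")))
            # zip_longest (izip_longest in Python 2) yields None for files that have run out of rows
            for rows in IT.zip_longest(*readers):
                towrite=[]
                for i, row in enumerate(rows):
                    if row is None:
                        # pad the exhausted file's columns with empty strings
                        towrite.extend(['']*ncolumns[i])
                        continue
                    selected = reorderfunc[i](row)
                    # itemgetter returns a bare string when only one column is selected
                    if isinstance(selected,str):
                        towrite.append(selected)
                    else:
                        towrite.extend(selected)
                writer.writerow(towrite)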
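
For comparison, here is a sketch of the same idea written for Python 3 with the call signature proposed in the question, where the input file names and the old headings are passed as lists. It is an illustration rather than the code from either answer: the name combine_csv_columns is made up, it opens the files with contextlib.ExitStack (Python 3.3+), and it assumes every input has the same number of data rows.

```python
import csv
import os
import tempfile
from contextlib import ExitStack

def combine_csv_columns(infilenames, outfilename, oldheadings, newheadings):
    """Copy selected columns from several CSV files side by side into one file.

    infilenames: list of input paths.
    oldheadings: list of comma-separated heading strings, one per input file.
    newheadings: one comma-separated string holding the output header.
    """
    with ExitStack() as stack:
        readers = [
            csv.reader(stack.enter_context(open(name, newline="")))
            for name in infilenames
        ]
        outfile = stack.enter_context(open(outfilename, "w", newline=""))
        writer = csv.writer(outfile)

        # For each input, translate the wanted headings into column indices.
        writeindices = []
        for reader, headings in zip(readers, oldheadings):
            name2index = {name: i for i, name in enumerate(next(reader))}
            writeindices.append([name2index[h] for h in headings.split(",")])

        writer.writerow(newheadings.split(","))
        # zip stops at the shortest input, so equal row counts are assumed.
        for rows in zip(*readers):
            writer.writerow(
                [row[i] for row, indices in zip(rows, writeindices) for i in indices]
            )

# Try it on the sample data from the question, written to a temporary directory.
samples = [
    ("freestream.csv",
     "static pressure,static temperature,relative Mach number\n1.01e5,288,5.00e-02\n"),
    ("fan.csv", "static pressure,static temperature,mass flow\n0.9e5,301,72.9\n"),
    ("exhaust.csv", "static pressure,static temperature,mass flow\n1.7e5,432,73.1\n"),
]
with tempfile.TemporaryDirectory() as tmpdir:
    paths = []
    for name, text in samples:
        path = os.path.join(tmpdir, name)
        with open(path, "w", newline="") as f:
            f.write(text)
        paths.append(path)
    outpath = os.path.join(tmpdir, "combined.csv")
    combine_csv_columns(
        paths, outpath,
        ["static pressure,relative Mach number", "static pressure,mass flow", "mass flow"],
        "P_amb,M0,Ps_fan,W_fan,W_exh",
    )
    with open(outpath, newline="") as f:
        combined = list(csv.reader(f))

print(combined)
```

Selecting columns with a plain list comprehension sidesteps the special case in operator.itemgetter, which returns a bare string instead of a tuple when only one column is requested.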