Question

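I want to combine columns from several CSV files into a single CSV file with a new header, concatenated horizontally. I want to select only certain columns, chosen by their headings. Each of the files to be combined has different columns.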

Example input:

freestream.csv:

static pressure,static temperature,relative Mach number
1.01e5,288,5.00e-02

fan.csv:

static pressure,static temperature,mass flow
0.9e5,301,72.9

exhaust.csv:

static pressure,static temperature,mass flow
1.7e5,432,73.1

Desired output:

combined.csv:

P_amb,M0,Ps_fan,W_fan,W_exh
1.01e5,5.00e-02,0.9e5,72.9,73.1

A possible call to the function:

reorder_multiple_CSVs(["freestream.csv","fan.csv","exhaust.csv"],
    "combined.csv",["static pressure,relative Mach number",
    "static pressure,mass flow","mass flow"],
    "P_amb,M0,Ps_fan,W_fan,W_exh")

Here is a previous version of the code, which allows only one input file. I wrote it with help from "write CSV columns out in a different order in Python":

import csv
import operator

def reorder_CSV(infilename,outfilename,oldheadings,newheadings):
    with open(infilename,newline='') as infile:
        with open(outfilename,'w',newline='') as outfile:
            reader = csv.reader(infile)
            writer = csv.writer(outfile)
            readnames = next(reader)  # header row of the input file
            # map each input heading to its column index
            name2index = dict((name,index) for index, name in enumerate(readnames))
            writeindices = [name2index[name] for name in oldheadings.split(",")]
            reorderfunc = operator.itemgetter(*writeindices)
            writer.writerow(newheadings.split(","))
            for row in reader:
                towrite = reorderfunc(row)
                # itemgetter returns a bare string when only one column is selected
                if isinstance(towrite,str):
                    writer.writerow([towrite])
                else:
                    writer.writerow(towrite)

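So, to adapt this to multiple files, my thinking so far is:

- I now need lists for infilename, oldheadings and newheadings (all of the same length)

- I need to iterate over the list of input files to make a list of readers

- readnames can also be a list, built by iterating over the readers

- which means I can make name2index a list of dictionaries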
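
The plan in the list above could be sketched roughly as follows. This is only an illustration: the helper name build_index_maps is made up here, and io.StringIO objects stand in for two of the example files so the snippet runs on its own.

```python
import csv
import io

def build_index_maps(readers):
    """Read one header row per reader and map each column name to its position."""
    maps = []
    for reader in readers:
        header = next(reader)  # consumes the header row of this reader
        maps.append({name: index for index, name in enumerate(header)})
    return maps

# In-memory stand-ins for freestream.csv and fan.csv from the example input.
freestream = io.StringIO(
    "static pressure,static temperature,relative Mach number\n1.01e5,288,5.00e-02\n")
fan = io.StringIO(
    "static pressure,static temperature,mass flow\n0.9e5,301,72.9\n")

readers = [csv.reader(f) for f in (freestream, fan)]
name2index = build_index_maps(readers)  # a list of dictionaries, one per file

oldheadings = ["static pressure,relative Mach number", "static pressure,mass flow"]
# One list of column indices per input file.
writeindices = [[mapping[name] for name in wanted.split(",")]
                for mapping, wanted in zip(name2index, oldheadings)]
print(writeindices)  # [[0, 2], [0, 2]]
```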

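The one thing I don't know how to do is use the with keyword nested n levels deep, when n is known only at run time. I read this: "How can I open multiple files using "with open" in Python?", but that seems to work only when you know in advance how many files you need to open.

Or is there a better way to do this?

I'm quite new to Python, so I'd appreciate any tips you can give me.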

Answer 1

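I'm only answering the part about using with to open multiple files when the number of files is not known in advance. It shouldn't be too hard to write your own contextmanager, something like this (completely untested):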

from contextlib import contextmanager

@contextmanager
def open_many_files(filenames):
    files = [open(filename) for filename in filenames]
    try:
        yield files
    finally:
        for f in files:
            f.close()

You can use it like this:

innames = ['file1.csv', 'file2.csv', 'file3.csv']
outname = 'out.csv'
with open_many_files(innames) as infiles, open(outname, 'w') as outfile:
    for infile in infiles:
        do_stuff(infile)

There is also a function that does something similar, but it is deprecated.
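
For readers on Python 3.3 or later, the standard library's contextlib.ExitStack covers the same case of a file count known only at run time. The snippet below is a minimal sketch, not a drop-in replacement for the code above; the temporary files and their contents are invented just so it can run on its own.

```python
import os
import tempfile
from contextlib import ExitStack

# Create a few small files so the example is self-contained (made-up contents).
tmpdir = tempfile.mkdtemp()
innames = []
for i in range(3):
    path = os.path.join(tmpdir, "file%d.csv" % (i + 1))
    with open(path, "w") as f:
        f.write("a,b\n%d,%d\n" % (i, i * 10))
    innames.append(path)

with ExitStack() as stack:
    # enter_context registers each file with the stack, so every file opened
    # so far is closed when the block exits, even if a later open() raises.
    infiles = [stack.enter_context(open(name)) for name in innames]
    first_lines = [f.readline().strip() for f in infiles]

print(first_lines)                     # ['a,b', 'a,b', 'a,b']
print(all(f.closed for f in infiles))  # True
```

Unlike a list comprehension of bare open() calls, this does not leak the files that were already opened if one of the later files is missing.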

Answer 2

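I'm not sure this is the right way to do it, but I wanted to expand on Bas Swinckels' answer. There were a few small inconsistencies in his very helpful answer, so I wanted to post the relevant code.

This is what I did, and it works.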

from contextlib import contextmanager
import csv
import operator
import itertools as IT

@contextmanager
def open_many_files(filenames):
    files=[open(filename,'r',newline='') for filename in filenames]
    try:
        yield files
    finally:
        for f in files:
            f.close()

def reorder_multiple_CSV(infilenames,outfilename,oldheadings,newheadings):
    # infilenames and newheadings are comma-separated strings; oldheadings is a list with one string per file
    with open_many_files(filter(None,infilenames.split(','))) as handles:
        with open(outfilename,'w',newline='') as outfile:
            readers=[csv.reader(f) for f in handles]
            writer = csv.writer(outfile)
            reorderfunc=[]
            maxcolumns=0
            for i, reader in enumerate(readers):
                readnames = next(reader)  # header row of the i-th input file
                maxcolumns = max(maxcolumns,len(readnames))
                name2index = dict((name,index) for index, name in enumerate(readnames))
                writeindices = [name2index[name] for name in filter(None,oldheadings[i].split(","))]
                reorderfunc.append(operator.itemgetter(*writeindices))
            writer.writerow(list(filter(None,newheadings.split(","))))
            # pad exhausted readers with a blank row wide enough for any selected index
            for rows in IT.zip_longest(*readers,fillvalue=['']*maxcolumns):
                towrite=[]
                for i, row in enumerate(rows):
                    selected = reorderfunc[i](row)
                    # itemgetter returns a bare string when only one column is selected
                    if isinstance(selected,str):
                        towrite.append(selected)
                    else:
                        towrite.extend(selected)
                writer.writerow(towrite)
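
One detail that is easy to trip over in code like this: operator.itemgetter returns a tuple when given several indices but a bare value when given only one, which matters for a file such as exhaust.csv where a single column is selected. A small sketch, using the exhaust.csv row from the question as sample data:

```python
import operator

row = ["1.7e5", "432", "73.1"]

two_columns = operator.itemgetter(0, 2)(row)
one_column = operator.itemgetter(2)(row)

print(two_columns)  # ('1.7e5', '73.1')
print(one_column)   # 73.1

towrite = []
towrite.extend(two_columns)
# Wrap a lone string in a list first; extending with the bare string
# would add its characters one at a time.
towrite.extend([one_column] if isinstance(one_column, str) else one_column)
print(towrite)  # ['1.7e5', '73.1', '73.1']
```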
