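Select a column from several files and write it to a new file with the file names as column names

Time: 2015-12-14 11:43:06

Tags: python pandas glob

My dataset has the following format: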

    Theta         DeltaD        DeltaS        Lambda        Rho           LogLik
1   0.0137060718  0.0378903969  0.4939959667  0.3795642767  0.57232859    -963.7743175455
2   0.0137060718  0.0378903969  0.4951036519  0.3795642767  0.57232859    -963.745770314
3   0.0136703063  0.038522257   0.4807565701  0.3551424944  0.5639182313  -964.5802838333
4   0.0136703063  0.0382752067  0.4597773216  0.3551424944  0.5621381788  -963.0634821126
5   0.0136703063  0.0377739624  0.4597773216  0.3486538546  0.5552092482  -963.315982188
6   0.0136119461  0.0359108581  0.4597773216  0.3486538546  0.5552092482  -963.5321138251
7   0.0136119461  0.0374395068  0.4597773216  0.3582883699  0.5862608093  -963.3432259866
8   0.0136119461  0.0374395068  0.4597773216  0.3582883699  0.5862608093  -963.3432259866
9   0.0136119461  0.0383243243  0.4597773216  0.3582883699  0.5862608093  -963.288725532
10  0.0136119461  0.0383243243  0.467850463   0.3582883699  0.5862608093  -963.058588502

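I want to select the DeltaS column from each file and save the output as CSV or any other delimited format, with the file names as the column names.

I came up with the following code: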

import glob
import numpy
import pandas as pd
import csv

outfile = open("final_DeltaS",'w')

list_of_files = []
for name in glob.glob('*iter.csv'):
    list_of_files.append(name)


def fileinput(files):
    for f in files:
        df = pd.read_csv(f)
        print f, df["DeltaS"]

fileinput(list_of_files)

But I am stuck on how to write the data out from this loop. Expected output:

File_1        File_2
0.0378903969  0.4939959667
0.0378903969  0.4951036519
0.038522257   0.4807565701
0.0382752067  0.4597773216
0.0377739624  0.4597773216
0.0359108581  0.4597773216
0.0374395068  0.4597773216
0.0374395068  0.4597773216
0.0383243243  0.4597773216
0.0383243243  0.467850463

1 answer:

Answer 0 (score: 0)

IIUC, the following should work:

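import glob
import pandas as pd

df_col_list = []
def fileinput(files):
    for f in files:
        # read only the DeltaS column and rename it after its source file
        df = pd.read_csv(f, usecols=['DeltaS'])
        df.rename(columns={'DeltaS':f}, inplace=True)
        df_col_list.append(df)

# the function has to be called before concatenating, otherwise the list is empty
fileinput(glob.glob('*iter.csv'))
# axis=1 places the per-file columns side by side
concat = pd.concat(df_col_list, axis = 1)
concat.to_csv(your_output_path)  # your_output_path: placeholder for the destination path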

You may need to trim the file names down to what you actually want as column names, but that is simple.
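
As a rough sketch of that trimming step: assuming the column label should be the file name without its directory and extension, `os.path` from the standard library can do it. The helper names `column_label` and `collect_column` are made up for illustration and are not part of pandas; the `read_kwargs` pass-through is likewise an assumption, added in case the input is not comma-separated.

```python
import os

import pandas as pd


def column_label(path):
    # Hypothetical helper: drop the directory and the extension,
    # e.g. "runs/a_iter.csv" becomes "a_iter".
    return os.path.splitext(os.path.basename(path))[0]


def collect_column(files, column="DeltaS", **read_kwargs):
    # Hypothetical helper: read one column from each file and
    # place the results side by side, one output column per file.
    # Extra keyword arguments are forwarded to pd.read_csv.
    labels = []
    series = []
    for path in files:
        df = pd.read_csv(path, usecols=[column], **read_kwargs)
        labels.append(column_label(path))
        series.append(df[column])
    # With axis=1, the keys become the column names of the result.
    return pd.concat(series, axis=1, keys=labels)
```

It could then be used along the lines of `collect_column(glob.glob('*iter.csv')).to_csv('final_DeltaS.csv', index=False)`, where the output file name is only an example. If the input files are tab-delimited rather than comma-separated, passing `sep='\t'` through to `pd.read_csv` would probably be needed.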