My dataset has the following format:
Theta DeltaD DeltaS Lambda Rho LogLik
1 0.0137060718 0.0378903969 0.4939959667 0.3795642767 0.57232859 -963.7743175455
2 0.0137060718 0.0378903969 0.4951036519 0.3795642767 0.57232859 -963.745770314
3 0.0136703063 0.038522257 0.4807565701 0.3551424944 0.5639182313 -964.5802838333
4 0.0136703063 0.0382752067 0.4597773216 0.3551424944 0.5621381788 -963.0634821126
5 0.0136703063 0.0377739624 0.4597773216 0.3486538546 0.5552092482 -963.315982188
6 0.0136119461 0.0359108581 0.4597773216 0.3486538546 0.5552092482 -963.5321138251
7 0.0136119461 0.0374395068 0.4597773216 0.3582883699 0.5862608093 -963.3432259866
8 0.0136119461 0.0374395068 0.4597773216 0.3582883699 0.5862608093 -963.3432259866
9 0.0136119461 0.0383243243 0.4597773216 0.3582883699 0.5862608093 -963.288725532
10 0.0136119461 0.0383243243 0.467850463 0.3582883699 0.5862608093 -963.058588502
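I want to select the DeltaS column from each file and save the output as CSV (or any other delimited format), using each file's name as the column name.
I came up with the following code: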
import glob
import numpy
import pandas as pd
import csv
outfile = open("final_DeltaS",'w')
# Collect every file whose name ends in "iter.csv"
list_of_files = []
for name in glob.glob('*iter.csv'):
    list_of_files.append(name)
def fileinput(files):
    for f in files:
        df = pd.read_csv(f)
        # Print the file name followed by its DeltaS column
        print(f, df["DeltaS"])
fileinput(list_of_files)
But I am stuck on how to write the data out from this loop. Expected output:
File_1 File_2
0.0378903969 0.4939959667
0.0378903969 0.4951036519
0.038522257 0.4807565701
0.0382752067 0.4597773216
0.0377739624 0.4597773216
0.0359108581 0.4597773216
0.0374395068 0.4597773216
0.0374395068 0.4597773216
0.0383243243 0.4597773216
0.0383243243 0.467850463
Answer 0 (score: 0)
IIUC, the following should work:
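df_col_list = []
def fileinput(files):
    for f in files:
        # Read only the DeltaS column and rename it after the source file
        df = pd.read_csv(f, usecols=['DeltaS'])
        df.rename(columns={'DeltaS': f}, inplace=True)
        df_col_list.append(df)
# Run the loop before concatenating; list_of_files comes from the question's code
fileinput(list_of_files)
# Place the per-file columns side by side
concat = pd.concat(df_col_list, axis=1)
concat.to_csv(your_output_path)  # your_output_path is a placeholder for the destination file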
You may need to trim the file name down to what you actually want as the column label, but that is straightforward.
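As a rough sketch of that last step, the file name can be trimmed with `pathlib.Path.stem` before it is used as the column label. The helper name `collect_column` and the output name `final_DeltaS.csv` are made up for illustration, and the sketch assumes the files are comma-separated with a `DeltaS` header (a whitespace-delimited file would need a different `sep`).

```python
from pathlib import Path

import pandas as pd


def collect_column(paths, column="DeltaS"):
    """Return a DataFrame with one column per file, labelled by the file stem.

    Hypothetical helper: assumes comma-separated files with a `column` header.
    """
    columns = {}
    for path in paths:
        # usecols makes pandas parse only the requested column
        frame = pd.read_csv(path, usecols=[column])
        # Path("run_1_iter.csv").stem gives "run_1_iter" (directory and extension dropped)
        columns[Path(path).stem] = frame[column]
    # The dict keys become the column labels; rows are aligned on the row index
    return pd.concat(columns, axis=1)


if __name__ == "__main__":
    files = sorted(Path(".").glob("*iter.csv"))
    if files:
        # index=False leaves the row numbers out of the output file
        collect_column(files).to_csv("final_DeltaS.csv", index=False)
```

If the files have different numbers of rows, the shorter columns are padded with NaN, since `pd.concat` with `axis=1` aligns on the row index.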