Question

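I have a folder containing several csv files with stock data. I want to read all of the files into data frames, drop the data I don't need, and then merge what remains into a single data frame. I have written some code that works, but it doesn't do the job well: it has intermediate steps that I would like to skip to make it more efficient. This is the code I'm using now.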

import pandas as pd
import os
import glob


my_dir = "Path where the csv files are stored" #Directory containing the InFront data files

my_dir2 = "Path where I save the csv files after dropping some columns, and the final file"

#Reads the csv file names into a list
filelist = []
os.chdir(my_dir)

for files in glob.glob("*.csv"):
    filelist.append(files)

#Opens each csv file from the list and drops everything but the close price from the data frame
for string in filelist:
    path = os.path.join(my_dir, string)
    frame = pd.read_csv(path, index_col=[1], parse_dates=True)

    #drop() needs the columns= keyword; the positional axis argument was removed in pandas 2.0
    frame = frame.drop(columns=['<TIME>', '<OPEN>', '<HIGH>', '<LOW>', '<VOL>'])
    frame.index.names = ['Date']
    #.ix was removed from pandas; .iloc is the positional equivalent
    ticker = frame['<TICKER>'].iloc[1]
    frame.rename(columns = {'<CLOSE>' : ticker}, inplace=True)
    frame.drop(columns=frame.columns[0], inplace=True)
    frame.sort_index(ascending=False, inplace=True)

    #Saves the file to the folder specified as my_dir2
    frame.to_csv(os.path.join(my_dir2, 'new %s' % string))


filelist = []
os.chdir(my_dir2)

for files in glob.glob("*.csv"):
    filelist.append(files)

df_list = [pd.read_csv(file, index_col='Date', parse_dates=True) for file in filelist]


big_df = pd.concat(df_list, axis=1)
big_df.sort_index(ascending=False, inplace=True)
big_df.to_csv('data.csv')

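As you can see, I have done this in two steps, and I also have to save the result of the first step. There must be a simpler way to do this, and I hope someone can help me.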
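
For illustration, a single-pass version might look something like the sketch below. It is not a tested solution for the InFront files. It assumes, as the code above does, that the second column of each file holds the dates and that each file contains a single ticker. The helper names `load_close` and `combine_closes` are made up for this example. Each file is reduced to its close-price column in memory, so no intermediate csv files are written.

```python
import glob
import os

import pandas as pd


def load_close(path):
    """Read one csv file and return its close prices as a Series named after the ticker."""
    # Assumption: the second column holds the dates, as with index_col=[1] above.
    frame = pd.read_csv(path, index_col=1, parse_dates=True)
    frame.index.name = "Date"
    # Assumption: every row in a file carries the same ticker, so the first row is enough.
    ticker = frame["<TICKER>"].iloc[0]
    # Selecting the one column that is needed replaces the chain of drop() calls.
    return frame["<CLOSE>"].rename(ticker)


def combine_closes(src_dir):
    """Build one frame with a close-price column per ticker, newest date first."""
    paths = sorted(glob.glob(os.path.join(src_dir, "*.csv")))
    closes = [load_close(path) for path in paths]
    # axis=1 lines the Series up side by side, aligned on their dates.
    big_df = pd.concat(closes, axis=1)
    return big_df.sort_index(ascending=False)
```

If those assumptions hold, `combine_closes(my_dir).to_csv(os.path.join(my_dir2, 'data.csv'))` would write the combined file directly, without the `os.chdir` calls or the second round of reading.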

Read csv files into data frames, modify parts of each frame, and append everything to a single frame

0 answers: