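I am reading data from multiple CSV files; the example below uses only one file. The CSV contains two columns of data, 'EFT' and 'CTR'. After reading the data, I try to compute an integral from the CTR values and store the result in a new column called ITCER. With about 10,000 data points, the whole process takes roughly 15 seconds. Ideally, I will be analyzing 25,000 data points from 8 different files, so it does take a while. Does anyone have suggestions on how to change the code that calculates ITCER (the integration part) so that it runs faster?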
CODE:
import pandas as pd

#### User input: run numbers, fed into the program as part of the file name
files = [20135501]

#### Process data paths
shorts = ['Python/CSV/Run Files/' + str(i) + '.csv' for i in files]

#### Read process data and interpolate missing values column by column
ferms = [pd.read_csv(s) for s in shorts]
csvDFs = [ferm.apply(pd.Series.interpolate) for ferm in ferms]

for i in range(len(files)):
    # Calculations: running sum over CTR, one row at a time (this is the slow part)
    b = []  # reset for each file so values from earlier files do not carry over
    for j in range(len(ferms[i])):
        if j == 0:
            c = ferms[i].iloc[j]['CTR [mM/h]']  # .iloc replaces the removed .irow
        if j > 0:
            g = ferms[i].iloc[j]['CTR [mM/h]']
            h = ferms[i].iloc[j-1]['CTR [mM/h]']
            c = c + (g+h)/120000
        b.append(c)
    # Insert calculations
    csvDFs[i]['EFT'] = csvDFs[i]['EFT (h)']
    csvDFs[i]['ITCER'] = b
    # Keep specific columns and then save the file in the output folder
    csvDFs[i] = csvDFs[i][['EFT','ITCER']]
    csvDFs[i].to_csv('Python/CSV/Test/'+ str(files[i])+'.csv', index = False)
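
For reference, the ITCER loop is a running sum: it starts at the first CTR value and adds (current + previous)/120000 at every row. One possible direction is to compute that whole column at once with NumPy instead of indexing rows one by one. The following is only a minimal sketch under assumptions: the helper name `cumulative_ctr`, its `divisor` parameter, and the sample values are hypothetical, and the CTR column is assumed to contain no missing values.

```python
import numpy as np
import pandas as pd


def cumulative_ctr(ctr, divisor=120000):
    """Hypothetical helper: vectorized form of the row-by-row ITCER loop.

    The result starts at ctr[0]; each later entry adds
    (ctr[j] + ctr[j-1]) / divisor to the previous entry.
    """
    values = np.asarray(ctr, dtype=float)
    if values.size == 0:
        return values
    # Pairwise sums of neighbouring rows, scaled like the original loop
    steps = (values[1:] + values[:-1]) / divisor
    # Prepend 0 so the first entry stays equal to ctr[0]
    return values[0] + np.concatenate(([0.0], np.cumsum(steps)))


# Made-up sample data using the column names from the question
df = pd.DataFrame({
    'EFT (h)': [0.0, 0.5, 1.0, 1.5],
    'CTR [mM/h]': [12.0, 24.0, 36.0, 48.0],
})
df['ITCER'] = cumulative_ctr(df['CTR [mM/h]'])
```

Because the work happens on whole arrays, the cost no longer grows with one Python-level row lookup per data point, which is likely where most of the 15 seconds goes. Whether the speedup is sufficient for the real files would need to be measured.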