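My program grabs the files in a directory that meet certain criteria, reads in the labels, reduces the columns to the ones specified, and then creates a CSV summarizing the files. I want it to read in the axis labels, print them on the first row of the file, and then print the mean of each of those axes on every subsequent row.
The error I get says there is no such axis. When I Googled the error, it looked like people run into it when adding a column to a DataFrame. I am only asking for columns that already exist.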
file_list = [f for f in os.listdir(pathname) if f.lower().endswith('.xls') and 'map' not in f.lower() and 'check' not in f.lower()]
temp_df = reduce_df(read_in(file_list[0]))
labels = temp_df.columns
print labels  # I printed this to the screen for troubleshooting purposes
with open(summary_file, 'wb') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(labels)
    for f in file_list:
        temp_df = reduce_df(read_in(f))
        print temp_df.columns  # ditto. this should match the previous one
        new_row = [temp_df.mean(col) for col in temp_df.columns]
        writer.writerow(new_row)
-output-
Using Python parser to sniff delimiter
Index([u'Product Mass Flow (kg/hr)', u'TC 03 (C)', u'MKS NO ppm (-)', u'MKS NO2 ppm (-)', u'MKS NH3 ppm (-)', u'MKS N2O ppm (-)', u'MKS H2O (%)', u'NOx Calc MKS (-)', u'MKS2 NO ppm (-)', u'MKS2 NO2 ppm (-)', u'MKS2 NH3 ppm (-)', u'MKS2 N2O ppm (-)', u'MKS2 H2O (%)', u'NOx Calc MKS2 (-)'], dtype='object')
Using Python parser to sniff delimiter
Index([u'Product Mass Flow (kg/hr)', u'TC 03 (C)', u'MKS NO ppm (-)', u'MKS NO2 ppm (-)', u'MKS NH3 ppm (-)', u'MKS N2O ppm (-)', u'MKS H2O (%)', u'NOx Calc MKS (-)', u'MKS2 NO ppm (-)', u'MKS2 NO2 ppm (-)', u'MKS2 NH3 ppm (-)', u'MKS2 N2O ppm (-)', u'MKS2 H2O (%)', u'NOx Calc MKS2 (-)'], dtype='object')
Traceback (most recent call last):
  File "(program trace here)", line 50, in <module>
    new_row = [temp_df.mean(col) for col in temp_df.columns]
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 3490, in stat_func
    skipna=skipna, numeric_only=numeric_only)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 3961, in _reduce
    axis = self._get_axis_number(axis)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 295, in _get_axis_number
    .format(axis, type(self)))
ValueError: No axis named Product Mass Flow (kg/hr) for object type <class 'pandas.core.frame.DataFrame'>
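As requested, here are my defs. The data logger puts a lot of stuff in there that either has confusing names or is simply unnecessary, so I am reducing it to what I want to summarize: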
def read_in(filename):
    current_df = pd.read_csv(filename, sep=None, skiprows=range(11))
    return current_df

def reduce_df(temp_df):
    columns = temp_df.columns
    columns = [column for column in columns if 'mks' in column.lower() or 'mass flow' in column.lower() or 'tc 03' in column.lower()]
    return temp_df[columns]
Answer 0 (score: 0)
I eventually figured it out (I will look into describe and to_csv), but here is what currently works:
with open(summary_file, 'wb') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(labels)
    for f in file_list:
        temp_df = reduce_df(read_in(f))
        # If you don't do the "value for value" part, you end up with the
        # column and value for each all in one cell
        new_row = [value for value in temp_df.mean()]
        # I added this to include the filename in the summary file for the
        # row the stuff in that row applied to
        new_row.insert(0, f)
        writer.writerow(new_row)
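For anyone who lands here with the same traceback, my understanding of why this works: the first positional argument of DataFrame.mean is the axis, not a column label, so temp_df.mean(col) asks pandas for an axis called 'Product Mass Flow (kg/hr)'. Calling mean() with no arguments returns one mean per column. Below is a rough sketch of that, plus the to_csv route I mentioned. The small DataFrame and the file names run_a.xls and run_b.xls are made up for illustration and stand in for the reduce_df(read_in(f)) results; I wrote it so it should also run on Python 3.

```python
import pandas as pd

# Hypothetical stand-in for one reduced data file (not real logger data)
df = pd.DataFrame({
    "TC 03 (C)": [100.0, 110.0, 120.0],
    "MKS NO ppm (-)": [1.0, 2.0, 3.0],
})

# The first positional argument of DataFrame.mean is `axis`, so a column
# label passed there is rejected as an unknown axis name.
try:
    df.mean("TC 03 (C)")
    axis_error = None
except ValueError as exc:
    axis_error = str(exc)

# With no arguments, mean() returns a Series holding one mean per column
col_means = df.mean()

# To get the mean of a single column, select the column first
single = df["TC 03 (C)"].mean()

# Collect one Series of means per file (hypothetical file names), turn them
# into a DataFrame with one row per file, and let pandas write the CSV
frames = {"run_a.xls": df, "run_b.xls": df * 2}
summary = pd.DataFrame(
    {name: frame.mean() for name, frame in frames.items()}
).T
summary.index.name = "file"

# With no path, to_csv returns the CSV text; pass a file name to write it
csv_text = summary.to_csv()
print(csv_text)
```

Because the file name becomes the index, the header row gets its own label for that first column, so the header and the data rows line up without inserting anything by hand.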