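I am trying to create a dataframe from three parent (source) dataframes, each created from a .csv file, but when I write the resulting dataframe to a file or print it to the screen, columns named "index" show up. How do I suppress or remove them?
The three "parent" dataframes: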
df1 ...
fname lname employer score1 score2 score3
0 Alice Adams IMB -1.0 2.5 -0.2
1 Alice Brown MFS 2.2 -7.9 3.7
2 Alice Curt OCR 2.6 -1.2 -0.7
df2 ...
fname lname employer score1 score2 score3
0 Alice Adams IMB 3.0 0.1 -2.9
1 Alice Brown MFS -2.1 2.6 -1.0
2 Alice Curt OCR 3.1 1.9 -0.1
df3 ...
fname lname employer score1 score2 score3
0 Alice Adams IMB -1.0 -2.1 0.1
1 Alice Brown MFS 3.2 -0.9 5.1
2 Alice Curt OCR -1.1 -1.2 -1.9
After a series of operations, I get this:
fname lname index employer score1 index employer score3 index employer score1 index employer score3 index employer score1 index employer score3
0 Alice Adams 0 IMB -1.0 2 OCR -0.7 1 MFS -2.1 0 IMB -2.9 2 OCR -1.1 2 OCR -1.9
1 Alice Brown 1 MFS 2.2 0 IMB -0.2 0 IMB 3.0 1 MFS -1.0 0 IMB -1.0 0 IMB 0.1
2 Alice Curt 2 OCR 2.6 1 MFS 3.7 2 OCR 3.1 2 OCR -0.1 1 MFS 3.2 1 MFS 5.1
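What I am looking for:
Remove the columns named "index".
I have an MWE, and the result pasted above comes from it. Let me know if you would like to see the source .csv and .py files here.
Addendum
Posting the source .csv files and the .py script: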
A.csv ...
fname,lname,employer,score1,score2,score3
Alice,Adams,IMB,-1.0,2.5,-0.2
Alice,Brown,MFS,2.2,-7.9,3.7
Alice,Curt,OCR,2.6,-1.2,-0.7
B.csv ...
fname,lname,employer,score1,score2,score3
Alice,Adams,IMB,3.0,0.1,-2.9
Alice,Brown,MFS,-2.1,2.6,-1.0
Alice,Curt,OCR,3.1,1.9,-0.1
C.csv ...
fname,lname,employer,score1,score2,score3
Alice,Adams,IMB,-1.0,-2.1,0.1
Alice,Brown,MFS,3.2,-0.9,5.1
Alice,Curt,OCR,-1.1,-1.2,-1.9
Now, the .py script ...
# -*- coding: utf-8 -*-
import fnmatch
import os
import matplotlib.pyplot as plt
import pandas as pd
pd.set_option('display.max_columns', None)
Datasets = ['A', 'B', 'C']
bigDF = pd.DataFrame()
for fname in Datasets:
    if fname == 'A':
        csvdf = pd.read_csv(fname + '.csv')
        # Name columns are kept in their original row order
        csvdfBUa = csvdf[['fname', 'lname']]
        # Each score is sorted on its own, then the rows are renumbered
        csvdfBUb = csvdf[['employer', 'score1']]
        csvdfBUb = csvdfBUb.sort_values(['score1'], ascending=[True])
        csvdfBUb = csvdfBUb.reset_index()
        csvdfBUc = csvdf[['employer', 'score3']]
        csvdfBUc = csvdfBUc.sort_values(['score3'], ascending=[True])
        csvdfBUc = csvdfBUc.reset_index()
        csvdfBU = pd.concat([csvdfBUa, csvdfBUb, csvdfBUc], axis=1, ignore_index=False)
        print(csvdf)
        if len(bigDF.index) < 1:
            bigDF = csvdfBU
        else:
            bigDF = pd.concat([bigDF, csvdfBU], axis=1, ignore_index=False)
    elif fname == 'B':
        csvdf = pd.read_csv(fname + '.csv')
        csvdfAFb = csvdf[['employer', 'score1']]
        csvdfAFb = csvdfAFb.sort_values(['score1'], ascending=[True])
        csvdfAFb = csvdfAFb.reset_index()
        csvdfAFc = csvdf[['employer', 'score3']]
        csvdfAFc = csvdfAFc.sort_values(['score3'], ascending=[True])
        csvdfAFc = csvdfAFc.reset_index()
        csvdfAF = pd.concat([csvdfAFb, csvdfAFc], axis=1, ignore_index=False)
        print(csvdf)
        if len(bigDF.index) < 1:
            bigDF = csvdfAF
        else:
            bigDF = pd.concat([bigDF, csvdfAF], axis=1, ignore_index=False)
    elif fname == 'C':
        csvdf = pd.read_csv(fname + '.csv')
        csvdfGAb = csvdf[['employer', 'score1']]
        csvdfGAb = csvdfGAb.sort_values(['score1'], ascending=[True])
        csvdfGAb = csvdfGAb.reset_index()
        csvdfGAc = csvdf[['employer', 'score3']]
        csvdfGAc = csvdfGAc.sort_values(['score3'], ascending=[True])
        csvdfGAc = csvdfGAc.reset_index()
        csvdfGA = pd.concat([csvdfGAb, csvdfGAc], axis=1, ignore_index=False)
        print(csvdf)
        if len(bigDF.index) < 1:
            bigDF = csvdfGA
        else:
            bigDF = pd.concat([bigDF, csvdfGA], axis=1, ignore_index=False)
print(bigDF)
Answer 0 (score: 1)
You can remove the "index" column with:
del df['index']
Note: I suspect you can avoid this in the first place ...
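One way to avoid the column in the first place, as a minimal sketch: the "index" columns come from calling `reset_index()` after sorting, which moves the old row labels into a new column. Passing `drop=True` discards them instead. The inline frame below is a hypothetical stand-in for `A.csv` (only the columns the script uses), and the names `names`, `by_score1`, `by_score3` and `result` are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for A.csv from the question (subset of its columns).
csvdf = pd.DataFrame({
    "fname": ["Alice", "Alice", "Alice"],
    "lname": ["Adams", "Brown", "Curt"],
    "employer": ["IMB", "MFS", "OCR"],
    "score1": [-1.0, 2.2, 2.6],
    "score3": [-0.2, 3.7, -0.7],
})

names = csvdf[["fname", "lname"]]

# drop=True throws the old row labels away instead of
# inserting them as a column named "index".
by_score1 = (csvdf[["employer", "score1"]]
             .sort_values("score1")
             .reset_index(drop=True))
by_score3 = (csvdf[["employer", "score3"]]
             .sort_values("score3")
             .reset_index(drop=True))

# Side-by-side concatenation aligns on the fresh 0..n-1 row labels.
result = pd.concat([names, by_score1, by_score3], axis=1)
print(result)
```

The same change should apply to each `reset_index()` call in the question's script, assuming the old row labels are not needed afterwards.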
Answer 1 (score: 0)
import pandas as pd
# Keep only the columns whose name does not contain 'index'
df = df.loc[:, ~df.columns.str.contains('index')]
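Since the frame in the question ends up with several columns that share the label "index", here is a small sketch of two ways to remove all of them at once. The frame is a hypothetical two-row cut-down of the question's output, and `dropped`, `keep` and `filtered` are illustrative names. `DataFrame.drop` matches the label exactly, while the boolean mask matches any column name containing the substring.

```python
import pandas as pd

# Hypothetical frame with two columns that are both labelled "index".
df = pd.DataFrame(
    [[0, "IMB", -1.0, 2, "OCR", -0.7],
     [1, "MFS", 2.2, 0, "IMB", -0.2]],
    columns=["index", "employer", "score1", "index", "employer", "score3"],
)

# Exact label match: removes every column whose label equals "index".
dropped = df.drop(columns="index")

# Substring match: True for columns to keep, False for any name
# that contains "index".
keep = ~df.columns.str.contains("index")
filtered = df.loc[:, keep]

print(dropped)
print(filtered)
```

The substring version would also remove a column such as "index_old", so the exact-label `drop` is probably the safer choice when only the literal "index" columns should go.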