I want to use row 3 as the column index (the column headers).
Answer 0 (score: 2)
The quick and simple answer is:
df.T.set_index(3).T
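To illustrate what the one-liner does, here is a minimal sketch on a small made-up frame (the sample values are an assumption, not the asker's data). Transposing turns row 3 into a column, set_index promotes it to the index, and transposing back makes it the column headers:

```python
import pandas as pd

# Hypothetical sample data: the row labeled 3 holds the intended column names.
df = pd.DataFrame(
    {'A': ['Groups', '4'], 'B': ['Quantity', '6'], 'C': ['Net Sales', '4']},
    index=[3, 4],
)

# Transpose so row 3 becomes a column, make it the index, then transpose back.
result = df.T.set_index(3).T

print(result.columns.tolist())
print(result)
```

Note that the columns axis keeps the name 3 after this operation, and the values stay as strings, which is presumably why the timing code further down also calls astype(float) and rename_axis.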
Answer 1 (score: 1)
df = pd.DataFrame({'A':['Groups'], 'B':['Quantity'], 'C':['Net Sales']}, index=[3])
df.columns = df.loc[3]
df = df.drop(3)
print (df)
Empty DataFrame
Columns: [Groups, Quantity, Net Sales]
Index: []
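But it is better to avoid the problem in the first place. For example, if you use read_csv to create the DataFrame, pass the skiprows parameter. The main advantage is that read_csv then infers the correct dtypes for all columns: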
import pandas as pd
# pandas.compat.StringIO was removed in newer pandas versions; use the standard library instead
from io import StringIO
temp=u"""A,B,C
D,E,F
G,H,I
J,K,L
Groups,Quantity,Net Sales
4,6,4"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp))
print (df)
        A         B          C
0       D         E          F
1       G         H          I
2       J         K          L
3  Groups  Quantity  Net Sales
4       4         6          4
df = pd.read_csv(StringIO(temp), skiprows=4)
print (df)
   Groups  Quantity  Net Sales
0       4         6          4
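The dtype advantage can be checked directly. The following is a small sketch modeled on the sample above (the variable names raw and clean are only illustrative); it compares the dtypes produced by reading the whole file with those produced when the junk lines are skipped:

```python
from io import StringIO

import pandas as pd

# Sample text modeled on the example above: the real header is on the fifth line.
temp = """A,B,C
D,E,F
G,H,I
J,K,L
Groups,Quantity,Net Sales
4,6,4"""

# Reading everything mixes text and numbers, so no column gets a numeric dtype.
raw = pd.read_csv(StringIO(temp))
print(raw.dtypes)

# Skipping the first four lines lets read_csv parse the numbers directly.
clean = pd.read_csv(StringIO(temp), skiprows=4)
print(clean.dtypes)
```

With the junk rows skipped, no later astype conversion is needed.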
Timings:
In [319]: %timeit (df.T.set_index(3).T.reset_index(drop=True).astype(float).rename_axis(None, 1))
10 loops, best of 3: 43.1 ms per loop
In [320]: %timeit (jez(df))
10 loops, best of 3: 23.7 ms per loop
In [321]: %timeit (jez1(df))
100 loops, best of 3: 13.6 ms per loop
Code for the timings:
In addition, a conversion to float was added to all solutions; it is not necessary if all the data are strings.
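import numpy as np

np.random.seed(100)
df = pd.DataFrame(np.random.random((100000,3)), columns=list('ABC'))
# cast to object so the string header row can be stored next to the numbers
df = df.drop([0,1,2]).astype(object)
df.loc[3] = ['Groups', 'Quantity', 'Net Sales']
print (df)

# rename_axis(None, axis=1) removes the leftover columns name (3);
# newer pandas versions require axis to be passed as a keyword
print (df.T.set_index(3).T.reset_index(drop=True).astype(float).rename_axis(None, axis=1))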
def jez(df):
    # note: this assigns the new headers on the frame that is passed in
    df.columns = df.loc[3]
    return df.drop(3).reset_index(drop=True).astype(float).rename_axis(None, axis=1)

def jez1(df):
    arr = df.values
    # get the position (row number) of the row labeled 3
    idx = df.index.get_loc(3)
    return pd.DataFrame(np.delete(arr, idx, axis=0).astype(float), columns=arr[idx])
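As a quick sanity check, the NumPy-based approach and the transpose approach can be compared on a tiny frame. This is only a sketch: the helper name promote_row_numpy and the sample values are hypothetical, and the helper simply mirrors the logic of jez1 with the row label as a parameter.

```python
import numpy as np
import pandas as pd

def promote_row_numpy(df, label):
    """Hypothetical helper mirroring jez1: use the row at `label` as headers."""
    arr = df.values
    idx = df.index.get_loc(label)  # position of the header row
    return pd.DataFrame(np.delete(arr, idx, axis=0).astype(float), columns=arr[idx])

# Small made-up frame: row 3 holds the headers, the other rows hold numbers.
df = pd.DataFrame(
    [['Groups', 'Quantity', 'Net Sales'], [4, 6, 4], [1, 2, 3]],
    index=[3, 4, 5],
    columns=list('ABC'),
)

via_numpy = promote_row_numpy(df, 3)
via_transpose = (
    df.T.set_index(3).T.reset_index(drop=True).astype(float).rename_axis(None, axis=1)
)

# Both results should carry the same headers and the same float values.
print(list(via_numpy.columns) == list(via_transpose.columns))
print(np.allclose(via_numpy.to_numpy(), via_transpose.to_numpy()))
```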