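If I have a document like this, where the column names are repeated in rows 1 and 2 and the units of the parameters are in row 3, how do I call pd.read_csv so that it creates a DataFrame whose header carries the column names and units, followed by the values?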
Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1
Answer 0 (score: 1)
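If you want a MultiIndex in the columns, use the parameter header=[0,1] to turn two rows into header levels. Here skiprows=[0] drops the first row, so the second and third rows (lowercase names and units) become the header: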
from io import StringIO
import pandas as pd
temp = """Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1"""
# after testing, replace StringIO(temp) with 'filename.csv'
# skiprows is applied first, so header=[0, 1] refers to the two rows that remain on top
df = pd.read_csv(StringIO(temp), sep=r"\s+", header=[0, 1], skiprows=[0])
print(df)
time speed torque
seconds m/s Nm
0 1 4000 229,5
1 2 4000 228,7
2 3 4000 230,1
print(df.columns)
MultiIndex(levels=[['speed', 'time', 'torque'], ['Nm', 'm/s', 'seconds']],
labels=[[1, 0, 2], [2, 1, 0]])
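The exact text printed for df.columns differs between pandas versions, so it can be easier to inspect the two header levels directly. The following is a minimal sketch that reuses the sample data from the question; the variable names (names, units, speed) are only illustrative.

```python
from io import StringIO

import pandas as pd

temp = """Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1"""

# Same call as above: drop the first row, use the next two rows as header levels.
df = pd.read_csv(StringIO(temp), sep=r"\s+", header=[0, 1], skiprows=[0])

# Level 0 holds the column names, level 1 holds the units.
names = list(df.columns.get_level_values(0))
units = list(df.columns.get_level_values(1))
print(names)  # ['time', 'speed', 'torque']
print(units)  # ['seconds', 'm/s', 'Nm']

# A (name, unit) tuple selects a single column as a Series.
speed = df[("speed", "m/s")]
print(speed.tolist())  # [4000, 4000, 4000]
```

Selecting with only the top-level name, for example df["speed"], returns a DataFrame whose single column is labeled with the unit.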
To keep the capitalized names from the first row together with the units, skip the second row instead:
from io import StringIO
import pandas as pd
temp = """Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=r"\s+", header=[0, 1], skiprows=[1])
print(df)
Time Speed Torque
seconds m/s Nm
0 1 4000 229,5
1 2 4000 228,7
2 3 4000 230,1
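Note that the Torque values use a comma as the decimal separator, so in the outputs above they are read as strings. If numeric values are wanted, one option is the decimal parameter of pd.read_csv. This is a sketch under the assumption that every comma in the file is a decimal mark and not a thousands separator.

```python
from io import StringIO

import pandas as pd

temp = """Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1"""

# decimal="," tells the parser to treat "229,5" as the float 229.5.
df = pd.read_csv(
    StringIO(temp),
    sep=r"\s+",
    header=[0, 1],
    skiprows=[1],
    decimal=",",
)

torque = df[("Torque", "Nm")]
print(torque.dtype)  # float64
print(torque.max() - torque.min())  # spread of the torque readings
```

With the column parsed as float64, arithmetic and aggregation on the torque readings work without any further conversion.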
If you want to omit the second and third rows, use only the skiprows parameter:
from io import StringIO
import pandas as pd
temp = """Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), sep=r"\s+", skiprows=[1, 2])
print(df)
Time Speed Torque
0 1 4000 229,5
1 2 4000 228,7
2 3 4000 230,1
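Skipping the units row discards the units entirely. A possible middle ground, sketched below, is to read the two-level header first, copy the units into a plain dict, and then flatten the columns to the names only. The dict name units is hypothetical and not part of the original answer.

```python
from io import StringIO

import pandas as pd

temp = """Time Speed Torque
time speed torque
seconds m/s Nm
1 4000 229,5
2 4000 228,7
3 4000 230,1"""

# Read names (row 1) and units (row 3) as a two-level header; row 2 is skipped.
df = pd.read_csv(StringIO(temp), sep=r"\s+", header=[0, 1], skiprows=[1])

# Each column label is a (name, unit) tuple, so dict() yields {name: unit}.
units = dict(df.columns.tolist())

# Keep only the names as ordinary single-level column labels.
df.columns = df.columns.get_level_values(0)

print(units)             # {'Time': 'seconds', 'Speed': 'm/s', 'Torque': 'Nm'}
print(list(df.columns))  # ['Time', 'Speed', 'Torque']
```

The DataFrame can then be used with simple labels such as df["Torque"], while units["Torque"] still gives the unit when it is needed for axis labels or reports.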