Is pandas' read_csv function 3x slower if header=0?

Asked: 2019-03-19 15:47:13

Tags: python pandas

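I am importing a 100 MB CSV file (950,630 rows) with pandas' read_csv function. I noticed that the import is about 3x faster if I set header to None and assign the column names myself afterwards. Any idea why?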

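import pandas as pd
import time

# Job 1: read with header=None, then promote the first row to column names
start = time.perf_counter()
df = pd.read_csv("data.txt", sep=',', engine='c', header=None,
                 na_filter=False, low_memory=False)
df.columns = df.iloc[0]      # first row holds the column names
df = df.drop(df.index[0])    # remove that row from the data
print("Job 1 took:", time.perf_counter() - start, "sec")
print(df.index.argmax())

# Job 2: let read_csv consume the header row itself
start2 = time.perf_counter()
df2 = pd.read_csv("data.txt", sep=',', engine='c', header=0,
                  na_filter=False, low_memory=False)
print("Job 2 took:", time.perf_counter() - start2, "sec")
print(df2.index.argmax())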

Job 1 took: 1.7068684101104736 sec

Job 2 took: 5.8732721090093994 sec
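Before reading too much into these two timings, it may be worth checking whether both jobs actually return the same data. With header=None the header line is parsed as an ordinary data row, so a column whose first value is a text name cannot come back as a numeric column. The sketch below illustrates this on a tiny in-memory CSV. The sample text and the helper names read_like_job_1 and read_like_job_2 are made up for illustration, and this only shows that the two results differ in dtype; it does not by itself establish the cause of the timing gap.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for data.txt (an assumption).
CSV_TEXT = "id,price,name\n1,2.5,a\n2,3.5,b\n3,4.5,c\n"


def read_like_job_1(text):
    """Read with header=None, then promote the first row to column names."""
    df = pd.read_csv(io.StringIO(text), sep=",", header=None, na_filter=False)
    df.columns = df.iloc[0]
    return df.drop(df.index[0])


def read_like_job_2(text):
    """Read with header=0 so the parser consumes the header row itself."""
    return pd.read_csv(io.StringIO(text), sep=",", header=0, na_filter=False)


df1 = read_like_job_1(CSV_TEXT)
df2 = read_like_job_2(CSV_TEXT)

# Job 1 keeps every value as text; job 2 infers numeric dtypes where it can.
print(df1.dtypes)
print(df2.dtypes)
```

If the dtypes differ on the real file as well, a fairer comparison would convert the job 1 columns to the dtypes job 2 produces and include that conversion in the timing.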

0 answers:

No answers yet.