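I am using pandas .astype() with a dictionary that maps column names to dtypes, so that each column is cast to the correct dtype. It works for str, int, datetime64[ns], and float, but fails on timedelta64[ns]. Running the code below raises ValueError: Could not convert object to NumPy timedelta.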
import pandas as pd
import numpy as np
sample_row = pd.DataFrame([['g1',
                            3912841,
                            '2018-09-29 16:03:49',
                            4.040196e+09,
                            '1 days 15:49:38']],
                          columns=['group',
                                   'job_number',
                                   'submission_time',
                                   'maxvmem',
                                   'wait_time'])
sample_row = (sample_row.astype(dtype={'group': 'str',
                                       'job_number': 'int',
                                       'submission_time': 'datetime64[ns]',
                                       'maxvmem': 'float',
                                       'wait_time': 'timedelta64[ns]'}))
I found this answer to a similar question, but it seems to indicate that I am already using the correct dtype format.
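For reference, here is a minimal sketch of one possible workaround. It assumes it is acceptable to parse the duration column separately with pd.to_timedelta and to call .astype() only on the remaining columns. It is an illustration, not a confirmed fix for the .astype() behavior itself:

```python
import pandas as pd

sample_row = pd.DataFrame([['g1',
                            3912841,
                            '2018-09-29 16:03:49',
                            4.040196e+09,
                            '1 days 15:49:38']],
                          columns=['group',
                                   'job_number',
                                   'submission_time',
                                   'maxvmem',
                                   'wait_time'])

# Assumption: parse the duration strings explicitly instead of via astype.
sample_row['wait_time'] = pd.to_timedelta(sample_row['wait_time'])

# Cast the remaining columns with the same dictionary-style astype call.
sample_row = sample_row.astype(dtype={'group': 'str',
                                      'job_number': 'int',
                                      'submission_time': 'datetime64[ns]',
                                      'maxvmem': 'float'})

print(sample_row.dtypes)
```

The trade-off is that the dtype mapping no longer lives in a single dictionary, since wait_time is handled on its own.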
Update: here is the same code with the change suggested by @hpaulj:
import pandas as pd
import numpy as np
sample_row = pd.DataFrame([['g1',
                            3912841,
                            '2018-09-29 16:03:49',
                            4.040196e+09,
                            pd.Timedelta('1 days 15:49:38')]],
                          columns=['group',
                                   'job_number',
                                   'submission_time',
                                   'maxvmem',
                                   'wait_time'])
sample_row = (sample_row.astype(dtype={'group': 'str',
                                       'job_number': 'int',
                                       'submission_time': 'datetime64[ns]',
                                       'maxvmem': 'float',
                                       'wait_time': 'timedelta64[ns]'}))
To confirm that the dtypes are set correctly, I check the type of each value in the first row:
for i in sample_row.loc[0, sample_row.columns]:
    print(type(i))
Output:
<class 'str'>
<class 'numpy.int32'>
<class 'pandas._libs.tslib.Timestamp'>
<class 'numpy.float64'>
<class 'pandas._libs.tslib.Timedelta'>
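The loop above reports the type of each element in one row. As a complementary check, the column-level dtypes can be inspected directly through the dtypes attribute. This is a minimal sketch on a reduced, hypothetical two-column frame rather than the full sample row:

```python
import pandas as pd

# Hypothetical reduced frame: only the two columns needed for the check.
check_frame = pd.DataFrame({'maxvmem': [4.040196e+09],
                            'wait_time': [pd.Timedelta('1 days 15:49:38')]})

# One dtype per column, instead of one Python type per element.
print(check_frame.dtypes)

# Programmatic check that the duration column is a timedelta column.
print(pd.api.types.is_timedelta64_dtype(check_frame['wait_time']))
```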