How to make timedelta64[ns] work with pandas astype() when converting multiple columns to different dtypes

Asked: 2019-02-06 18:05:00

Tags: python pandas numpy timedelta

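I am using pandas .astype() with a dictionary mapping column names to dtypes, to convert each column to its correct dtype. It works for str, int, datetime64[ns], and float, but fails on timedelta64[ns]. Running the code below raises ValueError: Could not convert object to NumPy timedelta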

import pandas as pd
import numpy as np

sample_row = pd.DataFrame([['g1', 
                            3912841, 
                            '2018-09-29 16:03:49', 
                            4.040196e+09, 
                            '1 days 15:49:38']], 
                          columns=['group',
                                   'job_number', 
                                   'submission_time', 
                                   'maxvmem', 
                                   'wait_time'])

sample_row = (sample_row.astype(dtype={'group':'str', 
                                       'job_number':'int', 
                                       'submission_time':'datetime64[ns]', 
                                       'maxvmem':'float', 
                                       'wait_time':'timedelta64[ns]'}))

I found this answer to a similar question, but it seems to indicate that I am already using the correct dtype format.
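For reference, one possible workaround is to parse the duration column explicitly with pd.to_timedelta() and leave only the other columns to .astype(). This is a minimal sketch, assuming the wait_time strings are in a format pd.to_timedelta() can parse (such as '1 days 15:49:38'). Whether .astype('timedelta64[ns]') accepts strings directly may depend on the pandas version, so that is not relied on here.

```python
import pandas as pd

sample_row = pd.DataFrame(
    [['g1', 3912841, '2018-09-29 16:03:49', 4.040196e+09, '1 days 15:49:38']],
    columns=['group', 'job_number', 'submission_time', 'maxvmem', 'wait_time'],
)

# Parse the duration strings explicitly instead of relying on astype().
sample_row['wait_time'] = pd.to_timedelta(sample_row['wait_time'])

# Convert the remaining columns with astype(), as in the question.
sample_row = sample_row.astype({
    'group': 'str',
    'job_number': 'int',
    'submission_time': 'datetime64[ns]',
    'maxvmem': 'float',
})

print(sample_row.dtypes)
```

Splitting the conversion this way keeps the dictionary-based .astype() call for the columns where it already works.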


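Update: here is the same code with the change suggested by @hpaulj (the wait_time value is now a pd.Timedelta instead of a string):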

import pandas as pd
import numpy as np

sample_row = pd.DataFrame([['g1', 
                            3912841, 
                            '2018-09-29 16:03:49', 
                            4.040196e+09, 
                            pd.Timedelta('1 days 15:49:38')]],
                          columns=['group',
                                   'job_number', 
                                   'submission_time', 
                                   'maxvmem', 
                                   'wait_time'])

sample_row = (sample_row.astype(dtype={'group':'str', 
                                       'job_number':'int', 
                                       'submission_time':'datetime64[ns]', 
                                       'maxvmem':'float', 
                                       'wait_time':'timedelta64[ns]'}))

To confirm that the dtypes are set correctly, check the type of each value in the first row:

for i in sample_row.loc[0, sample_row.columns]:
    print(type(i))

Output:

<class 'str'>
<class 'numpy.int32'>
<class 'pandas._libs.tslib.Timestamp'>
<class 'numpy.float64'>
<class 'pandas._libs.tslib.Timedelta'>
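The loop above reports the Python type of each scalar in row 0, not the dtype of each column. As a complementary check, the column dtypes can be read from DataFrame.dtypes. This is a small sketch that rebuilds the frame from the update. It assumes the DataFrame constructor infers a timedelta dtype for a column of pd.Timedelta objects, which is the usual behaviour but is not stated in the question.

```python
import pandas as pd

sample_row = pd.DataFrame(
    [['g1', 3912841, '2018-09-29 16:03:49', 4.040196e+09,
      pd.Timedelta('1 days 15:49:38')]],
    columns=['group', 'job_number', 'submission_time', 'maxvmem', 'wait_time'],
)

# One dtype per column, rather than one Python type per scalar value.
print(sample_row.dtypes)

# Programmatic check for a single column.
is_td = pd.api.types.is_timedelta64_dtype(sample_row['wait_time'])
print(is_td)
```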

0 answers:

No answers yet