Converting two timestamp strings in a single row to datetime

Time: 2019-09-30 05:31:50

Tags: python pandas dataframe timestamp

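My CSV has some columns whose values are timestamp strings run together, like the one below. How can I convert them to datetime in pandas, keeping only the most recent date?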

2019-09-27 09:15:422019-09-28 14:55:182019-09-26 04:54:12 

case[date]=case[date].apply(lambda x: pd.to_datetime(x,errors = 'coerce',infer_datetime_format=True))

But running it raises the following error:

('offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24).', 'occurred at index Preauth Pending Date')

2 answers:

Answer 0 (score: 0)

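Insert a comma before each four-digit number that is followed by a hyphen (the year), so the string can be split on commas; then convert the pieces to datetimes and take the row-wise max: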

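import pandas as pd

df = pd.DataFrame({'date':['2019-09-27 09:15:422019-09-28 14:55:182019-09-26 04:54:12',
                           '2018-09-27 09:15:422018-09-28 14:55:182020-09-26 04:54:12']})
#print (df)

# Unparseable pieces (such as the empty piece before the first comma) become NaT.
# infer_datetime_format is dropped here because it is deprecated in pandas 2.x.
f = lambda x: pd.to_datetime(x, errors='coerce')
# regex=True is needed on pandas >= 2.0, where str.replace matches literally by default.
df['last'] = (df['date'].str.replace(r'(\d{4}-)', r',\1', regex=True)
                        .str.split(',', expand=True)
                        .apply(f)
                        .max(axis=1))
print (df)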
                                                date                last
0  2019-09-27 09:15:422019-09-28 14:55:182019-09-... 2019-09-28 14:55:18
1  2018-09-27 09:15:422018-09-28 14:55:182020-09-... 2020-09-26 04:54:12
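
As an alternative to inserting a separator and splitting, the timestamps can be pulled out directly with a regular expression. This is only a minimal sketch, assuming every timestamp has the exact form YYYY-MM-DD HH:MM:SS; the variable names (pattern, stamps, parsed) are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'date': [
    '2019-09-27 09:15:422019-09-28 14:55:182019-09-26 04:54:12',
    '2018-09-27 09:15:422018-09-28 14:55:182020-09-26 04:54:12',
]})

# One capture group per timestamp (assumed fixed layout).
pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})'

# extractall returns one row per match, indexed by (original row, match number).
stamps = df['date'].str.extractall(pattern)[0]
parsed = pd.to_datetime(stamps, format='%Y-%m-%d %H:%M:%S', errors='coerce')

# Group by the original row label and keep the latest timestamp in each row.
df['last'] = parsed.groupby(level=0).max()
print(df['last'])
```

Rows with no match at all have no entry in the grouped result, so they end up as NaT when the result is assigned back by index.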

Edit:

from numpy import nan

d = {'Preauth Pending Date': [nan, nan, nan, '2019-09-21 05:34:06', nan],
 'Preauth Pending Updated Date': [nan, nan, nan, '2019-09-23 10:29:05', nan],
 'Claim Pending Date': ['2019-09-26 15:51:492019-09-16 09:40:06', nan,'2019-09-24 11:59:33', nan, nan],
 'Claim Pending Updated Date': ['2019-09-27 09:06:122019-09-16 09:49:34', nan, '2019-09-25 09:13:45', nan, nan]}


df = pd.DataFrame(d)
#print (df)

f = lambda x: pd.to_datetime(x, errors='coerce')
for c in df.columns:
    # regex=True is needed on pandas >= 2.0, where str.replace matches literally by default.
    df[c] = (df[c].str.replace(r'(\d{4}-)', r',\1', regex=True)
                  .str.split(',', expand=True)
                  .apply(f)
                  .max(axis=1))
print (df)

  Preauth Pending Date Preauth Pending Updated Date  Claim Pending Date  \
0                  NaT                          NaT 2019-09-26 15:51:49   
1                  NaT                          NaT                 NaT   
2                  NaT                          NaT 2019-09-24 11:59:33   
3  2019-09-21 05:34:06          2019-09-23 10:29:05                 NaT   
4                  NaT                          NaT                 NaT   

  Claim Pending Updated Date  
0        2019-09-27 09:06:12  
1                        NaT  
2        2019-09-25 09:13:45  
3                        NaT  
4                        NaT  
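
The per-column loop can also be wrapped in a small helper and passed to DataFrame.apply, which leaves the original frame untouched. This is a rough sketch of the same replace/split/max idea; the helper name latest_timestamp is hypothetical, and the sample frame uses only two of the columns from the edit above.

```python
import numpy as np
import pandas as pd

def latest_timestamp(col: pd.Series) -> pd.Series:
    """Return the most recent timestamp found in each cell of a string column."""
    # Put a comma in front of every year, then split into one column per piece.
    parts = (col.str.replace(r'(\d{4}-)', r',\1', regex=True)
                .str.split(',', expand=True))
    # Empty pieces and missing values are coerced to NaT.
    parsed = parts.apply(pd.to_datetime, errors='coerce')
    return parsed.max(axis=1)

df = pd.DataFrame({
    'Claim Pending Date': ['2019-09-26 15:51:492019-09-16 09:40:06',
                           np.nan, '2019-09-24 11:59:33'],
    'Claim Pending Updated Date': ['2019-09-27 09:06:122019-09-16 09:49:34',
                                   np.nan, '2019-09-25 09:13:45'],
})

# apply calls the helper once per column and reassembles a DataFrame.
result = df.apply(latest_timestamp)
print(result)
```

Note that the helper relies on the .str accessor, so a column holding only NaN (float dtype) would need to be cast to object or skipped first.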

Answer 1 (score: 0)

Try this:

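import pandas as pd

df = pd.DataFrame({'date':['2019-09-27 09:15:422019-09-28 14:55:182019-09-26 04:54:12']})
# Cut each string into 19-character pieces ('YYYY-MM-DD HH:MM:SS') and keep the latest one.
df['date']=df['date'].apply(lambda x : max([pd.to_datetime(x[i:i+19]) for i in range(0,len(x),19)]))
df['date']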

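I am assuming your date format does not change, so the string can be split by length (19 characters per timestamp) before picking the latest date.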
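
The fixed-width idea is sensitive to missing values and stray whitespace, for example a trailing space after the last timestamp. A slightly more defensive sketch follows, assuming the 19-character layout holds for every timestamp; the names STAMP_LEN and latest_fixed_width are hypothetical.

```python
import pandas as pd

STAMP_LEN = 19  # len('YYYY-MM-DD HH:MM:SS'), assumed constant

def latest_fixed_width(cell):
    """Return the latest timestamp in a run of fixed-width timestamps, or NaT."""
    # Missing values (None / NaN) and blank strings have nothing to parse.
    if not isinstance(cell, str) or not cell.strip():
        return pd.NaT
    text = cell.strip()  # drop surrounding whitespace before slicing
    chunks = [text[i:i + STAMP_LEN] for i in range(0, len(text), STAMP_LEN)]
    # Chunks that do not match the expected format are coerced to NaT.
    parsed = pd.to_datetime(pd.Series(chunks),
                            format='%Y-%m-%d %H:%M:%S', errors='coerce')
    return parsed.max()

s = pd.Series(['2019-09-27 09:15:422019-09-28 14:55:182019-09-26 04:54:12 ', None])
latest = s.apply(latest_fixed_width)
print(latest)
```

If the width assumption is ever violated, the slices drift out of alignment and the affected chunks become NaT rather than raising, so it is worth checking for unexpected NaT values afterwards.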