Merging DataFrames on currency/date fails

Asked: 2018-05-30 05:32:14

Tags: python pandas dataframe merge

I have a problem merging two datasets (xrate and df) based on currency_str and created_date_time.

display(xrate.info())

Int64Index: 1611 entries, 6 to 112
Data columns (total 3 columns):
Date        1611 non-null datetime64[ns]
PX_LAST     1611 non-null object
Currency    1611 non-null object

display(xrate.head(3))

Date       PX_LAST  Currency
2018-05-30  1       CAD
2018-05-29  1       CAD
2018-05-28  1       CAD

I created a new date column to merge on:

#df['formatted_created_date_time'] = df['created_date_time'].dt.strftime('%d%m%Y')
df['formatted_created_date_time'] = df['created_date_time'].dt.strftime('%d-%m-%Y')
#convert to date
#df['formatted_created_date_time'] = pd.to_datetime(df['formatted_created_date_time'], format='%d%m%Y')
df['formatted_created_date_time'] = pd.to_datetime(df['formatted_created_date_time'], format='%d-%m-%Y')


display(df.info())

RangeIndex: 3488 entries, 0 to 3487
Data columns (total 43 columns):
created_date_time              3488 non-null datetime64[ns]
rfq_create_date_time           3488 non-null datetime64[ns]
currency_str                   3488 non-null object

display(df.head(3))

[image: dataframe output]

Now I merge the two dataframes:

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], right_on=['Currency', 'Date'], how='left')

display(result.info())

RangeIndex: 3488 entries, 0 to 3487
Data columns (total 43 columns):
created_date_time              3488 non-null datetime64[ns]
rfq_create_date_time           3488 non-null datetime64[ns]
.
.
formatted_created_date_time    3488 non-null datetime64[ns]

The match failed:

display(result.head(3))

[image: output of result.head(3)]

System datetime:

[image: system datetime]

Any ideas about this?

1 answer:

Answer 0 (score: 1)

It should work fine.
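One way to check where the merge stops matching is to pass indicator=True, which marks each row as matched or left-only. The sketch below uses small invented frames that only mimic the column names from the question (the values are assumptions, not the asker's data), and builds the key the same way the question does:

```python
import pandas as pd

# Toy stand-ins for the question's frames; all values are invented.
df = pd.DataFrame({
    "currency_str": ["CAD", "CAD", "USD"],
    "created_date_time": pd.to_datetime(
        ["2018-05-30 05:32:14", "2018-05-29 11:00:00", "2018-05-30 09:15:00"]
    ),
})
xrate = pd.DataFrame({
    "Date": pd.to_datetime(["2018-05-30", "2018-05-29", "2018-05-28"]),
    "PX_LAST": ["1", "1", "1"],
    "Currency": ["CAD", "CAD", "CAD"],
})

# Same key construction as in the question: drop the time of day.
df["formatted_created_date_time"] = pd.to_datetime(
    df["created_date_time"].dt.strftime("%d-%m-%Y"), format="%d-%m-%Y"
)

result = pd.merge(
    df, xrate,
    left_on=["currency_str", "formatted_created_date_time"],
    right_on=["Currency", "Date"],
    how="left",
    indicator=True,  # adds a "_merge" column: "both" or "left_only"
)

# How many rows found a rate, and which key pairs did not.
match_counts = result["_merge"].value_counts()
unmatched = result.loc[
    result["_merge"] == "left_only",
    ["currency_str", "formatted_created_date_time"],
]
print(match_counts)
print(unmatched)
```

In this toy data the USD row stays unmatched because xrate has no USD entries. On real data, the unmatched key pairs usually point at the cause, for example a currency or date that is missing on the right side, or stray whitespace in the currency strings.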

But another solution is to merge on strings:
df['formatted_created_date_time'] = df['created_date_time'].dt.strftime('%d-%m-%Y')
xrate['Date'] = xrate['Date'].dt.strftime('%d-%m-%Y')

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], 
                             right_on=['Currency', 'Date'], how='left')

Your solution can be simplified with floor or date:
# Option 1: floor both keys to midnight (the columns stay datetime64)
df['formatted_created_date_time'] = df['created_date_time'].dt.floor('d')
xrate['Date'] = xrate['Date'].dt.floor('d')

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], 
                             right_on=['Currency', 'Date'], how='left')

# Option 2: convert both keys to Python date objects (the columns become object dtype)
df['formatted_created_date_time'] = df['created_date_time'].dt.date
xrate['Date'] = xrate['Date'].dt.date

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], 
                             right_on=['Currency', 'Date'], how='left')
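To illustrate why floor can replace the strftime/to_datetime round trip, here is a minimal sketch on an invented Series of timestamps (the values are assumptions for illustration only). It compares the three ways of dropping the time of day and shows which dtype each one produces, since both merge keys need to end up with the same type:

```python
import pandas as pd

# Invented timestamps with a time-of-day component.
ts = pd.Series(pd.to_datetime(["2018-05-30 05:32:14", "2018-05-29 23:59:59"]))

# Round trip used in the question: format to a string, then parse it back.
via_strftime = pd.to_datetime(ts.dt.strftime("%d-%m-%Y"), format="%d-%m-%Y")

# Flooring to the day gives midnight of the same date without the string detour.
via_floor = ts.dt.floor("d")

# .dt.date yields Python date objects, stored in an object-dtype column.
via_date = ts.dt.date

same_keys = bool((via_floor == via_strftime).all())
print(same_keys)
print(via_floor.dtype, via_date.dtype)
```

Whichever variant is used, applying the same one to both df and xrate keeps the key columns comparable.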