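I am currently working with a database made up of many DataFrames, and I want to add a column to one of them. I am trying to set a condition on the date so that each row in df1 gets the information that df2 holds for the corresponding date. On top of that, the information should only be added when the ID in df1 matches the ID in df2. I am struggling to add this second condition. Can anyone help? I have been stuck on this for too long. Here is what I have so far: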
conditions = [
(df1['colDate'] < '2015-11-30'),
(df1['colDate'] >= '2015-11-30') & (df1['colDate'] < '2016-11-30'),
(df1['colDate'] >= '2016-11-30') & (df1['colDate'] < '2017-11-30'),
(df1['colDate'] >= '2017-11-30') & (df1['colDate'] < '2018-11-30'),
(df1['colDate'] >= '2018-11-30') & (df1['colDate'] < '2019-12-02'),
(df1['colDate'] >= '2019-12-02') & (df1['colDate'] < '2020-06-03'),
(df1['colDate'] >= '2020-06-03')
]
#create the list of values to apply for each condition
scores = [df2.loc[df2['colDate'] == '2014-11-30' & df1['colId'] == df2['colId'], 'colScore'],
df2.loc[df2['colDate'] == '2015-11-30' & df1['colId'] == df2['colId'], 'colScore'],
df2.loc[df2['colDate'] == '2016-11-30' & df1['colId'] == df2['colId'], 'colScore'],
df2.loc[df2['colDate'] == '2017-11-30' & df1['colId'] == df2['colId'], 'colScore'],
df2.loc[df2['colDate'] == '2018-11-30' & df1['colId'] == df2['colId'], 'colScore'],
df2.loc[df2['colDate'] == '2019-12-02' & df1['colId'] == df2['colId'], 'colScore'],
df2.loc[df2['colDate'] == '2020-06-03' & df1['colId'] == df2['colId'], 'colScore']]
#create the new column in Index_data
df1['score']=np.select(conditions, scores)
and it returns this error:
TypeError: Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]
Thanks!
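
A note on the error and one possible way around it. The TypeError most likely comes from operator precedence: in Python `&` binds more tightly than `==`, so `'2014-11-30' & df1['colId']` is evaluated before either comparison. Wrapping each comparison in parentheses would remove that error, but `np.select` would probably still fail, because each entry in `scores` is a filtered slice of df2 and does not line up row by row with df1. The date buckets in `conditions` amount to "take the most recent df2 row at or before the df1 date, for the same ID", which is what `pandas.merge_asof` does. Below is a minimal sketch under assumptions: the sample data and the helper name `add_scores` are hypothetical, both `colDate` columns are assumed to be datetime, and, unlike the first condition above, rows dated before the earliest df2 date get NaN rather than the 2014-11-30 score.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({
    "colId": [1, 2, 1, 2],
    "colDate": pd.to_datetime(
        ["2015-03-01", "2016-01-15", "2016-12-01", "2020-07-01"]
    ),
})
df2 = pd.DataFrame({
    "colId": [1, 1, 1, 2, 2, 2],
    "colDate": pd.to_datetime(
        ["2014-11-30", "2015-11-30", "2016-11-30",
         "2014-11-30", "2015-11-30", "2020-06-03"]
    ),
    "colScore": [10, 11, 12, 20, 21, 26],
})


def add_scores(left, right):
    """Attach to each row of `left` the latest score from `right` that has
    the same colId and a colDate at or before the row's colDate."""
    # merge_asof requires both frames to be sorted by the `on` key.
    left_sorted = left.sort_values("colDate")
    right_sorted = right.sort_values("colDate")
    merged = pd.merge_asof(
        left_sorted,
        right_sorted[["colId", "colDate", "colScore"]],
        on="colDate",          # as-of match on the date
        by="colId",            # exact match on the ID
        direction="backward",  # latest df2 date <= df1 date
    )
    return merged.rename(columns={"colScore": "score"})


result = add_scores(df1, df2)
print(result)
```

The result comes back sorted by `colDate` rather than in df1's original order; if the original order matters, one option is to keep the original index in a column before merging and sort by it afterwards.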