I have two DataFrames.
The DataFrame `prices` contains minute-by-minute pricing:
ts average
2017-12-13 15:55:00-05:00 339.389
2017-12-13 15:56:00-05:00 339.293
2017-12-13 15:57:00-05:00 339.172
2017-12-13 15:58:00-05:00 339.148
2017-12-13 15:59:00-05:00 339.144
The DataFrame `articles` contains articles:
ts title
2017-10-25 11:45:00-04:00 Your Evening Briefing
2017-11-24 14:15:00-05:00 Tesla's Grand Designs Distract From Model 3 Bo...
2017-10-26 11:09:00-04:00 UAW Files Claim That Tesla Fired Workers Who S...
2017-10-25 11:42:00-04:00 Forget the Grid of the Future, Puerto Ricans J...
2017-10-22 09:54:00-04:00 Tesla Reaches Deal for Shanghai Facility, WSJ ...
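For the moment each article is published, I want the current average stock price (easy), plus the stock price at the end of that day (the problem).
My current approach: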
articles['t-eod'] = prices.loc[articles.index.strftime('%Y-%m-%d')[0]].between_time('15:30','15:31')
However, it raises a warning:
/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
"""Entry point for launching an IPython kernel.
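Reading the documentation did not make things any clearer to me.
So the question is: for each article, how can I get the last average price of that article's day?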
Thanks!
/Maurice
Answer 0 (score: 1)
You can try using idxmax on ts to find the index of the latest timestamp for each date, and then use loc to select those rows:
import pandas as pd

#Reset our index
prices_df.reset_index(inplace=True)
articles_df.reset_index(inplace=True)
#Ensure our ts field is datetime
prices_df['ts'] = pd.to_datetime(prices_df['ts'])
articles_df['ts'] = pd.to_datetime(articles_df['ts'])
#Get the row with the latest timestamp for each date in prices_df
df_max = prices_df.loc[prices_df.groupby(prices_df.ts.dt.date).ts.idxmax()]
#We need to join df_max and articles on the date so we make a new index
df_max['date'] = df_max.ts.dt.date
articles_df['date'] = articles_df.ts.dt.date
df_max.set_index('date',inplace=True)
articles_df.set_index('date',inplace=True)
#Set our max field
articles_df['max'] = df_max['average']
articles_df.set_index('ts',inplace=True)
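
For reference, here is a minimal self-contained sketch of the same idea that you can run on its own. The sample rows, the article titles, and the column name `eod_average` are made up for illustration, and the timestamps are timezone-naive to keep it short; your real data is timezone-aware, so treat this as a sketch rather than a drop-in solution. It uses `map` instead of the index juggling above, which should give the same result.

```python
import pandas as pd

# Hypothetical sample data shaped like the frames in the question.
prices_df = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2017-12-12 15:58:00",
                "2017-12-12 15:59:00",
                "2017-12-13 15:58:00",
                "2017-12-13 15:59:00",
            ]
        ),
        "average": [341.10, 341.25, 339.148, 339.144],
    }
)
articles_df = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2017-12-12 11:45:00", "2017-12-13 09:54:00", "2017-12-13 14:15:00"]
        ),
        "title": ["Article A", "Article B", "Article C"],
    }
)

# Row label of the latest timestamp within each calendar date.
last_rows = prices_df.groupby(prices_df["ts"].dt.date)["ts"].idxmax()

# Build a lookup Series: date -> last average price on that date.
eod_rows = prices_df.loc[last_rows]
eod_by_date = pd.Series(eod_rows["average"].to_numpy(), index=eod_rows["ts"].dt.date)

# Look up each article's date in that Series (eod_average is a made-up column name).
articles_df["eod_average"] = articles_df["ts"].dt.date.map(eod_by_date)

print(articles_df)
```

Articles published on a date with no price rows would get NaN in `eod_average`, so you may want to check for that with `articles_df["eod_average"].isna()`.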