Analysing time series data using a pandas DataFrame

Time: 2020-05-11 12:54:42

Tags: python pandas

I have the time series data below and would like to run some specific analysis on it:

"timestamp","epic","closeprice_bid","closeprice_ask","last_traded_volume"
"2020-03-24 12:00:00","KA.D.BARC.DAILY.IP","91.17","91.38","7836277"
"2020-03-24 13:00:00","KA.D.BARC.DAILY.IP","90.33","90.66","8001075"
"2020-03-24 14:00:00","KA.D.BARC.DAILY.IP","89.96","90.22","11490520"
"2020-03-24 15:00:00","KA.D.BARC.DAILY.IP","91.62","91.89","9014323"
"2020-03-24 16:00:00","KA.D.BARC.DAILY.IP","93.84","94.23","7270054"
"2020-03-24 16:00:00","KA.D.BARC.DAILY.IP","93.84","94.23","7270054.0"
"2020-03-25 08:00:00","KA.D.BARC.DAILY.IP","109.47","109.89","25414762.0"
"2020-03-25 08:00:00","KA.D.BARC.DAILY.IP","109.47","109.89","25414762

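I would like to simulate a basic trading strategy, using a pandas DataFrame to analyse the time series data as follows:
1) Check whether today's first closeprice_bid differs from the previous day's last closeprice_bid by ≥ 1% or ≤ -1%.
2) For each hourly row, check whether closeprice_bid is ≥ 3% or ≤ -3% away from today's opening closeprice_bid.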

Can someone give me some guidance on how to do the above analysis with pandas?

I have loaded the data into a df using the following code:

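import pandas as pd

cols = ['timestamp', 'epic', 'closeprice_bid', 'closeprice_ask','last_traded_volume']
# header=0 skips the header row in the file; names=cols supplies the column names
stock_data = pd.read_csv('barc.csv', header=0, names=cols)
# make sure the bid price is numeric; unparseable values become NaN
stock_data['closeprice_bid'] = pd.to_numeric(stock_data['closeprice_bid'], errors='coerce')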

2 answers:

Answer 0: (score: 1)

You can do something like this:


And something similar for the hourly case.
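For the first check in the question, one possible reading is to compare each day's first closeprice_bid with the previous day's last one. Below is a minimal sketch under that assumption. The helper name `daily_gap`, its `threshold` parameter, and the output column names are made up for illustration and are not part of pandas. The sample rows are taken from the question's data.

```python
import pandas as pd

def daily_gap(df, threshold=0.01):
    """Hypothetical helper: today's first bid vs. the previous day's last bid."""
    df = df.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.sort_values("timestamp")
    # One row per calendar day with that day's first and last bid.
    daily = df.groupby(df["timestamp"].dt.date)["closeprice_bid"].agg(["first", "last"])
    daily.index.name = "date"
    # Shift the last bid down one row so it lines up with the next day.
    daily["prev_last"] = daily["last"].shift(1)
    # Fractional change: 0.01 means 1%. The first day has no previous day, so NaN.
    daily["gap"] = (daily["first"] - daily["prev_last"]) / daily["prev_last"]
    # True when the move is at least `threshold` in either direction.
    daily["gap_hit"] = daily["gap"].abs() >= threshold
    return daily

sample = pd.DataFrame(
    [
        ("2020-03-24 12:00:00", 91.17),
        ("2020-03-24 16:00:00", 93.84),
        ("2020-03-25 08:00:00", 109.47),
    ],
    columns=["timestamp", "closeprice_bid"],
)
result = daily_gap(sample)
print(result)
```

Taking the absolute value covers both the ≥ 1% and the ≤ -1% case in one comparison. Keep the signed `gap` column if the direction of the move matters for the strategy.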

Answer 1: (score: 0)

You can try the following:

# df is the frame loaded in the question (there called stock_data)
df['date'] = pd.to_datetime(df['timestamp']).dt.date
df['time'] = pd.to_datetime(df['timestamp']).dt.time
df.sort_values(by=['date', 'time'], inplace=True)
# first and last closeprice_bid of each day
first = df.groupby(by='date').first()['closeprice_bid']
df_first = first.to_frame().reset_index()
last = df.groupby(by='date').last()['closeprice_bid']
df_last = last.to_frame().reset_index()

# one row per day: closeprice_bid_x is the first bid, closeprice_bid_y the last
df_merged = df_first.merge(df_last, left_on=['date'], right_on=['date'])
df_merged.set_index(['date'], inplace=True)
# pct_change returns a fraction, so 1% is 0.01 (not 1)
df_merged['pct_change'] = df_merged.pct_change(axis=1)['closeprice_bid_y']
df_merged['greater_than_1'] = df_merged['pct_change'] > 0.01
print(df_merged)

            closeprice_bid_x  closeprice_bid_y  pct_change  greater_than_1
date
2020-03-24             91.17             93.84    0.029286            True
2020-03-25            109.47            109.47    0.000000           False
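The code above compares the first and last bid within the same day. For the second check in the question (each hourly row against that day's opening bid), a rough sketch could look like the following. It assumes the first row of each day counts as the opening bid. The helper name `hourly_vs_open`, its `threshold` parameter, and the new column names are illustrative only.

```python
import pandas as pd

def hourly_vs_open(df, threshold=0.03):
    """Hypothetical helper: flag rows that moved >= threshold from the day's first bid."""
    df = df.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df = df.sort_values("timestamp").reset_index(drop=True)
    day = df["timestamp"].dt.date
    # Broadcast each day's first bid to every row of that day.
    df["open_bid"] = df.groupby(day)["closeprice_bid"].transform("first")
    # Fractional move relative to the day's open: 0.03 means 3%.
    df["move_from_open"] = df["closeprice_bid"] / df["open_bid"] - 1
    # True when the move is at least `threshold` up or down.
    df["move_hit"] = df["move_from_open"].abs() >= threshold
    return df

sample = pd.DataFrame(
    [
        ("2020-03-24 12:00:00", 91.17),
        ("2020-03-24 13:00:00", 90.33),
        ("2020-03-24 14:00:00", 89.96),
        ("2020-03-24 15:00:00", 91.62),
        ("2020-03-24 16:00:00", 93.84),
        ("2020-03-25 08:00:00", 109.47),
    ],
    columns=["timestamp", "closeprice_bid"],
)
hourly = hourly_vs_open(sample)
print(hourly[["timestamp", "open_bid", "move_from_open", "move_hit"]])
```

`transform("first")` keeps the original row count, so each hourly row can be compared with its own day's open without a merge. On these sample rows no hour reaches the 3% threshold; the 16:00 row on 2020-03-24 comes closest at roughly 2.9%.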