Filling in a time series DataFrame

Time: 2018-03-05 12:50:45

Tags: python pandas dataframe time-series

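I imported a DataFrame containing time series data using pandas. The first column is an incomplete datetime vector, because the time series only includes the data points at which a trade took place. The next four columns are prices, and the last three columns are 'Volume', 'Number_Ticks' and 'Value'. I would like to edit this DataFrame as follows: I want to fill in the date vector so that the time step is 1 minute, and for all inserted rows I want the last three columns to be zero. I accomplished this with the following code.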

import pandas as pd

def insert_row(idx, df, df_insert):
    # Split the frame at position idx and place the new row in between.
    dfA = df.iloc[:idx, ]
    dfB = df.iloc[idx:, ]

    df = dfA.append(df_insert).append(dfB).reset_index(drop = True)

    return df

df=pd.read_excel("file_location",skiprows=3,sheet_name='sheet1')
for i in range(1,len(df)):
    while df.iloc[i,0]-df.iloc[i-1,0]>pd.Timedelta('1 minute'):
        # New row: previous row with the time advanced by 1 minute,
        # prices unchanged, and the last three columns set to zero.
        df=insert_row(i,df,df.iloc[i-1]+[pd.Timedelta('1minute'),
                0,0,0,0,-df.iloc[i-1,-3],-df.iloc[i-1,-2],-df.iloc[i-1,-1]])

Does anyone know another way to do this that is better and more efficient?

The data looks like this:

Time Series Data

The desired output is: Desired output dataframe

{'Close': {0: 12.65, 1: 12.65, 2: 12.65, 3: 12.65, 4: 12.65},

'Dates': {0: Timestamp('2018-01-08 09:00:00'),
  1: Timestamp('2018-01-08 09:01:00'),
  2: Timestamp('2018-01-08 09:05:00'),
  3: Timestamp('2018-01-08 09:06:00'),
  4: Timestamp('2018-01-08 09:10:00')},

'High': {0: 12.65, 1: 12.65, 2: 12.65, 3: 12.65, 4: 12.65},

'Low': {0: 12.6, 1: 12.65, 2: 12.65, 3: 12.65, 4: 12.65},

'Number_Ticks': {0: 16, 1: 4, 2: 3, 3: 1, 4: 1},

'Open': {0: 12.6, 1: 12.65, 2: 12.65, 3: 12.65, 4: 12.65},

'Value': {0: 83071.8438,
  1: 17279.8984,
  2: 12839.75,
  3: 4263.0498,
  4: 4288.3501},

'Volume': {0: 6568, 1: 1366, 2: 1015, 3: 337, 4: 339}}

1 answer:

Answer 0 (score: 0)

This should be fairly easy using resample and fillna:

df2 = df.set_index('Dates').resample('1 min').first()
price_df = df2[['Open', 'High', 'Low', 'Close']].bfill()
volume_df = df2[['Volume', 'Number_Ticks', 'Value']].fillna(0)
result = pd.concat((price_df, volume_df), axis=1)

                      Open   High    Low  Close  Volume  Number_Ticks       Value
Dates
2018-01-08 09:00:00  12.60  12.65  12.60  12.65  6568.0          16.0  83071.8438
2018-01-08 09:01:00  12.65  12.65  12.65  12.65  1366.0           4.0  17279.8984
2018-01-08 09:02:00  12.65  12.65  12.65  12.65     0.0           0.0      0.0000
2018-01-08 09:03:00  12.65  12.65  12.65  12.65     0.0           0.0      0.0000
2018-01-08 09:04:00  12.65  12.65  12.65  12.65     0.0           0.0      0.0000
2018-01-08 09:05:00  12.65  12.65  12.65  12.65  1015.0           3.0  12839.7500
2018-01-08 09:06:00  12.65  12.65  12.65  12.65   337.0           1.0   4263.0498
2018-01-08 09:07:00  12.65  12.65  12.65  12.65     0.0           0.0      0.0000
2018-01-08 09:08:00  12.65  12.65  12.65  12.65     0.0           0.0      0.0000
2018-01-08 09:09:00  12.65  12.65  12.65  12.65     0.0           0.0      0.0000
2018-01-08 09:10:00  12.65  12.65  12.65  12.65   339.0           1.0   4288.3501
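
For anyone who wants to try this without the Excel file, here is a minimal self-contained sketch that rebuilds the five sample rows from the question and applies the same resample idea. One deliberate difference, which is my assumption about what the asker wants: prices are forward-filled (ffill) rather than back-filled, since the loop in the question copies the previous row's prices into each inserted row. The names price_cols, activity_cols and filled are only illustrative.

```python
import pandas as pd

# Sample rows copied from the dict posted in the question.
df = pd.DataFrame({
    "Dates": pd.to_datetime([
        "2018-01-08 09:00", "2018-01-08 09:01", "2018-01-08 09:05",
        "2018-01-08 09:06", "2018-01-08 09:10",
    ]),
    "Open": [12.6, 12.65, 12.65, 12.65, 12.65],
    "High": [12.65, 12.65, 12.65, 12.65, 12.65],
    "Low": [12.6, 12.65, 12.65, 12.65, 12.65],
    "Close": [12.65, 12.65, 12.65, 12.65, 12.65],
    "Volume": [6568, 1366, 1015, 337, 339],
    "Number_Ticks": [16, 4, 3, 1, 1],
    "Value": [83071.8438, 17279.8984, 12839.75, 4263.0498, 4288.3501],
})

price_cols = ["Open", "High", "Low", "Close"]
activity_cols = ["Volume", "Number_Ticks", "Value"]

# Put the data on a regular 1-minute grid; minutes without a trade become NaN.
filled = df.set_index("Dates").resample("1min").first()

# Assumption: carry the last known prices forward into the inserted rows.
filled[price_cols] = filled[price_cols].ffill()

# No trades in the inserted minutes, so the activity columns are zero.
filled[activity_cols] = filled[activity_cols].fillna(0)

print(filled)
```

On this sample both fill directions give the same prices, because every gap sits between rows with identical prices; on real data they differ, so pick the one that matches how a missing minute should be interpreted.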