Appending a column to a DataFrame that already contains data

Date: 2015-08-27 14:56:48

Tags: python pandas

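I have a pandas DataFrame and I want to add another column to it, but I have not managed to do so with append, add, or insert.

I am trying to replicate the portfolio data using pandas' built-in functionality, because the script below does not return correct data when the period I request is shorter than about 1.5 years, or when I need data for just two consecutive days. This is the script I want to rewrite: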

import QSTK.qstkutil.qsdateutil as du
import QSTK.qstkutil.tsutil as tsu
import QSTK.qstkutil.DataAccess as da

import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd

ls_symbols = ["AAPL", "GLD", "GOOG", "$SPX", "XOM"]
dt_start = dt.datetime(2006, 1, 1)
dt_end = dt.datetime(2010, 12, 31)
dt_timeofday = dt.timedelta(hours=16)
ldt_timestamps = du.getNYSEdays(dt_start, dt_end, dt_timeofday)

c_dataobj = da.DataAccess('Yahoo')
ls_keys = ['open', 'high', 'low', 'close', 'volume', 'actual_close']
ldf_data = c_dataobj.get_data(ldt_timestamps, ls_symbols, ls_keys)
d_data = dict(zip(ls_keys, ldf_data))
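
The structure that last line produces can be sketched with made-up numbers. This assumes each element of ldf_data is a DataFrame indexed by timestamp with one column per symbol; the values and the shortened key list are illustrative only, not real QSTK output.

```python
import pandas as pd

# Made-up stand-ins for the QSTK objects (assumptions, not real market data).
ls_keys = ['open', 'close']
ldt_timestamps = pd.to_datetime(['2010-01-04 16:00', '2010-01-05 16:00'])
ldf_data = [
    pd.DataFrame({'AAPL': [1.0, 2.0], 'GLD': [3.0, 4.0]}, index=ldt_timestamps),
    pd.DataFrame({'AAPL': [1.5, 2.5], 'GLD': [3.5, 4.5]}, index=ldt_timestamps),
]

# Pair each key with its DataFrame, as the script does.
d_data = dict(zip(ls_keys, ldf_data))

# Result: one DataFrame per key, one column per symbol.
close_aapl = d_data['close']['AAPL']
```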

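d_data = dict(zip(ls_keys, ldf_data)) is the part I want to replicate, because it does not fetch the data I need. To do that, I have to find a way to append a new column to the DataFrames stored in my dict. Here is my script: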

from pandas.io.data import DataReader, DataFrame
import QSTK.qstkutil.qsdateutil as du
import QSTK.qstkutil.DataAccess as da

import datetime as dt

def get_historical_data(symbol, source, date_from, date_to):
    global data_validator
    symbol_data = {}

    ls_keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

    for key in ls_keys:
        symbol_data[key] = DataFrame({})

    dataframe_open = DataFrame({})

    for item in symbol:
        print 'Fetching data for:', item
        current_data = DataReader(str(item), source, date_from, date_to)
        dataframe_open = {item : current_data['Open']}
        if len(symbol_data['Open'].columns) == 0:
            symbol_data['Open'] = DataFrame(dataframe_open)
        else:
            # I want to add the new column here but can't seem to do this.
            #symbol_data['Open'].loc[:item] = DataFrame(dataframe_open)
            pass
    return symbol_data

P.S. I call the function with these arguments for testing:

test = get_historical_data(['SPY', 'DIA'], 'yahoo', dt.datetime(2015, 1, 1), dt.datetime(2015, 1, 31))
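
For the else branch in the function above, one option is plain column assignment. The following is a minimal sketch only: the two small DataFrames are made-up stand-ins for what DataReader would return, so it runs without network access.

```python
import pandas as pd

dates = pd.date_range('2015-01-01', periods=3)

# Made-up stand-ins for the DataReader results of two symbols.
spy = pd.DataFrame({'Open': [10.0, 11.0, 12.0]}, index=dates)
dia = pd.DataFrame({'Open': [20.0, 21.0, 22.0]}, index=dates)

# First symbol: create the per-key DataFrame, as in the if branch.
symbol_data = {'Open': pd.DataFrame({'SPY': spy['Open']})}

# Later symbols: assigning a Series to a new column label adds the column
# in place; pandas aligns the Series to the existing index.
symbol_data['Open']['DIA'] = dia['Open']
```

Because the assignment aligns on the existing index, dates that appear only in the new Series are dropped. If the symbols may cover different dates, pd.concat with axis=1 keeps the union of the indexes instead.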

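1 Answer:

Answer 0 (score: 0)

Does the following help? I haven't tested it, but in principle it should work: collect the data for every symbol first, so that all columns have the same length, and then build the DataFrames.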

def get_historical_data(symbols=None, source=None, date_from=None, date_to=None):
    ls_keys = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']

    # Fetch each symbol once. DataReader and DataFrame come from the
    # question's imports; each result is expected to be a date-indexed
    # DataFrame with the columns listed in ls_keys.
    fetched = {}
    for item in symbols or []:
        print('Fetching data for: %s' % item)
        fetched[item] = DataReader(str(item), source, date_from, date_to)

    # Build one DataFrame per key, with one column per symbol.
    symbol_data = {}
    for key in ls_keys:
        symbol_data[key] = DataFrame(dict((item, fetched[item][key]) for item in fetched))
    return symbol_data
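
An alternative way to build the same dict in one step is pd.concat with keys, followed by a cross-section per key. This is a sketch under stated assumptions: fetch_prices and FAKE_BASE are hypothetical stand-ins for DataReader that return made-up numbers, and only two keys are used to keep it short.

```python
import pandas as pd

# Hypothetical base prices so that each symbol gets distinct made-up values.
FAKE_BASE = {'SPY': 100.0, 'DIA': 200.0}


def fetch_prices(symbol, dates):
    """Hypothetical stand-in for DataReader; returns made-up prices."""
    base = FAKE_BASE[symbol]
    n = len(dates)
    return pd.DataFrame(
        {'Open': [base + i for i in range(n)],
         'Close': [base + i + 0.5 for i in range(n)]},
        index=dates,
    )


def build_symbol_data(symbols, dates, keys):
    frames = [fetch_prices(symbol, dates) for symbol in symbols]
    # Columns become a two-level index: (symbol, key).
    combined = pd.concat(frames, axis=1, keys=symbols)
    # Slice out one DataFrame per key, with one column per symbol.
    return dict((key, combined.xs(key, axis=1, level=1)) for key in keys)


dates = pd.date_range('2015-01-01', periods=3)
result = build_symbol_data(['SPY', 'DIA'], dates, ['Open', 'Close'])
```

Concatenating along axis=1 uses the union of the date indexes, so a symbol with missing days ends up with NaN in those rows rather than losing them.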