Save variables per iteration to a df, then save the df to csv when done

Time: 2015-05-23 08:57:12

Tags: python csv pandas dataframe

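I need to build a DataFrame (df_max_res) that holds the 15 best performances from my stock strategies, together with the company ticker (for example AAPL for Apple Computer). I have a list of more than 500 stock tickers that I analyse with my own strategies.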

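In the inner iteration of the nested `for eachP in perf_array` loop I get a performance result for every combination of strategy and ticker. I want to save these results to a DataFrame and to a csv file using this code (better suggestions are welcome):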

#==============================================================================
#     Saving results in pandas and to a csv-file
#==============================================================================
def saving_res_pandas():
    global df_res, df_max_res
    df_res = pd.DataFrame(columns=('Strategy', 'Ticker', 'Strat', 
                                   'ROI', 'Sharpe R', 'VaR'))
    for eachP in perf_array:
        df_res.loc[len(df_res) + 1] = [strategy, ticker, strat, stratROI]  
    # Select the top 15 of all results (ticker/strategy combo) into new df.
    df_max_res = df_res[:15]           
    # Saving to a csv.
    df_max_res.to_csv('df_performance_data_sp500ish.csv')
    print('After analysing %1.1f Years ~ %d workdays - %d strategies and %d tickers' '\n'
          'The following matrix of tickers and strategies show highest ROI: ' 
          % (years, days, len(strategies), len(stock_list))
         )

    return df_res
#==============================================================================
# Choose which of the methods below to save perf-data to disk with
#==============================================================================
saving_res_pandas()

# Reading in df_max_res with best ticker/strategy results
df_max_res = pd.read_csv('df_performance_data_sp500ish.csv')
print(df_max_res)

The code above creates my DataFrame just fine, but it does not save the per-iteration performance results the way I expected.

I get this output:

=======================================================
 aa   ===   <function strategy1 at 0x00000000159A0BF8>   ==
=======================================================


Holdings: 0
Funds: 14659

Starting Valuation:  USD 15000.00 ~ DKK: 100000.50
Current Valuation:   USD 14659.05 ~ DKK: 97727.49

===  aa  == <function strategy1 at 0x00000000159A0BF8> ==
ROI: -1.9 perc. & Annual Profit -1894 DKK  ==
######################################################################

cannot set a row with mismatched columns

== ALL Tickers Done for ==  <function strategy1 at 0x00000000159A0BF8> ==================
Strategy analysis pr ticker - COMPLETE !

Empty DataFrame
Columns: [Unnamed: 0, Strategy, Ticker, ROI, SharpeR, VaR]
Index: []
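
For reference, the `cannot set a row with mismatched columns` line in the output is the message pandas raises when a row assigned through `.loc` has a different number of values than the DataFrame has columns. Here df_res is created with six columns while the assigned list has four items. A minimal sketch of that situation; the values 'long', 0.0 and 0.0 are made-up placeholders, not results from the backtest:

```python
import pandas as pd

cols = ('Strategy', 'Ticker', 'Strat', 'ROI', 'Sharpe R', 'VaR')
df_res = pd.DataFrame(columns=cols)

# Four values for six columns: pandas rejects the row with a ValueError.
caught = None
try:
    df_res.loc[len(df_res) + 1] = ['strategy1', 'aa', 'long', -1.9]
except ValueError as exc:
    caught = exc
    print('Row rejected:', exc)

# One value per column: the row is added.
# The last two numbers are placeholders for Sharpe R and VaR.
df_res.loc[len(df_res) + 1] = ['strategy1', 'aa', 'long', -1.9, 0.0, 0.0]
print(df_res)
```

If the exception is caught and only printed somewhere further up in the loop, the DataFrame stays empty, which would match the `Empty DataFrame` shown above.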

2 answers:

Answer 0 (score: 0)

I tried to reduce your code to make it more readable:

 1. def saving_res_pandas():
 2.     cols = ('Strategy', 'Ticker', 'Strat', 'ROI', 'Sharpe R', 'VaR')
 3.     df_res = pd.DataFrame(columns=cols)
 4.     for _ in perf_array:
 5.         df_res.loc[len(df_res) + 1] = [strategy, ticker, strat, stratROI]  
 6.     # Select the top x of all results (ticker/strategy combo) into new df.
 7.     df_max_res = df_res[:15]           
 8.     df_max_res.to_csv('df_performance_data_sp500ish.csv')
 9.     print('After analysing {0:.1f} Years ~ {1} workdays - {2} strategies and {3} tickers' '\n'
10.           'The following matrix of tickers and strategies show highest ROI: '  
11.           .format(years, days, len(strategies), len(stock_list)))
12.     return df_res

Based on the code above, I have a couple of questions:

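  1. On line 5, how are the values of strategy, ticker, strat and stratROI obtained?
  2. On line 7, you take the first 15 rows of df_res, but the DataFrame has not been sorted. In the original code the sort line below was commented out (so I removed it in my edit).

    df_res.reset_index().sort(['ROI', 'VaR', 'Sharpe R'], ascending=[0,1,0])

  3. When you say you want the 15 best performers, by which metric (ROI, VaR, Sharpe, etc.)?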
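
To illustrate the sorting point, here is a minimal sketch that assumes ROI is the ranking metric (the question does not say). All ROI values except the -1.9 for aa are made up, and the top 2 are taken instead of the top 15 to keep the example small. In current pandas versions `DataFrame.sort` no longer exists; `sort_values` is used instead.

```python
import pandas as pd

# Made-up results; only the aa / strategy1 ROI comes from the question's output.
df_res = pd.DataFrame({
    'Strategy': ['strategy1', 'strategy1', 'strategy2', 'strategy2'],
    'Ticker': ['AAPL', 'aa', 'AAPL', 'aa'],
    'ROI': [4.2, -1.9, 7.5, 0.3],
})

# Slicing without sorting returns the first rows in insertion order.
top_unsorted = df_res[:2]

# Sorting by ROI (highest first) before slicing returns the best performers.
top_by_roi = df_res.sort_values('ROI', ascending=False).head(2)

# nlargest is a shorthand for the same selection.
top_nlargest = df_res.nlargest(2, 'ROI')

print(top_unsorted)
print(top_by_roi)
```

To rank by several metrics at once, `sort_values` also accepts a list of column names and a matching list for `ascending`.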

Answer 1 (score: 0)

In the end I managed to find a working solution to my problem.

I solved it like this:

Before the for loops:

# Creating the df that will save my results in the backtest iterations
cols = ('Strategy','Ticker','ROI')  # ,'Sharpe R','VaR','Strat'
df_res = pd.DataFrame(columns = cols)

Inside the for loop and the nested for loops:

def saving_res_pandas():
    global df_res, df_max_res
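    # DataFrame.append was removed in pandas 2.0; pd.concat adds the row instead.
    new_row = pd.DataFrame([{'Ticker': ticker, 'Strategy': strategy, 'ROI': stratROI}])
    df_res = pd.concat([df_res, new_row], ignore_index=True)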

    return df_res

Outside and after the for loops:

        df_res = df_res.sort_values('ROI', ascending=False)  # highest ROI first
        df_max_res = df_res.head(15)           # Select the top x of all results (ticker/strategy combo) into new df
        # saving to a csv #
        df_max_res.to_csv('df_performance_data_sp500ish.csv')

    print('After analysing %1.1f Years ~ %d workdays - %d strategies and %d tickers' '\n'
    'The following matrix of tickers and strategies show highest ROI:' %(years, days, len(strategies), len(stock_list))
    )
    print()
    print(df_max_res)
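
A possible variation on the above, sketched under assumptions: collect each result as a dict in a plain list and build the DataFrame once after the loops, which avoids growing the DataFrame row by row. `run_backtest`, `fake_roi` and the small strategy and ticker lists are hypothetical stand-ins for the real backtest, and the csv is written to a temporary directory only so the example is self-contained.

```python
import os
import tempfile

import pandas as pd

# Made-up ROI values standing in for real backtest results.
fake_roi = {
    ('strategy1', 'AAPL'): 4.2,
    ('strategy1', 'aa'): -1.9,
    ('strategy2', 'AAPL'): 7.5,
    ('strategy2', 'aa'): 0.3,
}


def run_backtest(strategy, ticker):
    """Hypothetical placeholder for the real backtest; returns an ROI."""
    return fake_roi[(strategy, ticker)]


# Collect one dict per strategy/ticker combination.
rows = []
for strategy in ['strategy1', 'strategy2']:
    for ticker in ['AAPL', 'aa']:
        rows.append({'Strategy': strategy,
                     'Ticker': ticker,
                     'ROI': run_backtest(strategy, ticker)})

# Build the DataFrame once, sort, and keep the best rows (3 here, 15 in the real run).
df_res = pd.DataFrame(rows, columns=['Strategy', 'Ticker', 'ROI'])
df_max_res = df_res.sort_values('ROI', ascending=False).head(3)

# index=False keeps the row index out of the file, so reading it back
# does not produce an extra 'Unnamed: 0' column.
csv_path = os.path.join(tempfile.mkdtemp(), 'df_performance_data_sp500ish.csv')
df_max_res.to_csv(csv_path, index=False)

df_check = pd.read_csv(csv_path)
print(df_check)
```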

Thanks for the help and inspiration.