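How do I append the result of each iteration to a final DataFrame that contains all iteration results?

Time: 2016-02-26 10:51:01

Tags: python pandas append dataframe concat

I am trying to append the results of a calculated matrix to my df. My problem is how to design the big picture of my iterative calculations. The following code should illustrate what I want to do.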

import pandas as pd
from pandas import DataFrame
import numpy as np

np_all = np.array([[1, 'vws.co', 1],
                    [1, 'nflx', 3],
                    [1, 'aapl', 2],
                    [2, 'vws.co', 1],
                    [2, 'nflx', 2],
                    [2, 'aapl', 1],
                    [3, 'vws.co', 1],
                    [3, 'nflx', 3],
                    [3, 'aapl', 1]])


df_all = pd.DataFrame(data=np_all, columns=['Date', 'Ticker', 'Close'])
# DataFrame.sort was removed from pandas; sort_values is the replacement
df_all = df_all.sort_values(['Ticker', 'Date'], ascending=[True, True])

df_kpi_list = []
stocklist   = ['vws.co','nflx','aapl']

print (df_all)

def kpi1_calc(df, ticker):
    # do some KPI calculations that are appended to new columns of df
    return df

def kpi2_calc(df, ticker):
    # do more KPI calculations that are appended to new columns of df
    return df

def kpi3_calc(df, ticker):
    # example of a KPI calculation that is appended to a new column of df
    rsi = 3  # placeholder constant that is stored in a df column
    df['RSI'] = rsi
    return df

def screener(df_all, ticker):
    # Copy the rows of df_all for the relevant ticker only
    df = df_all[df_all['Ticker'] == ticker.lower()].copy()

    # each KPI step adds its columns to the per-ticker df
    df = kpi1_calc(df, ticker)
    df = kpi2_calc(df, ticker)
    df = kpi3_calc(df, ticker)

    # store this ticker's result in the module-level list
    df_kpi_list.append(df)

    # concatenate all the ticker-iteration dfs from df_kpi_list into one df
    return pd.concat(df_kpi_list)

if __name__ == '__main__':
    for ticker in stocklist:
        df_data = screener(df_all, ticker)

    print (df_data)

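I have several layers of increasing data complexity:

  1. df_kpi_list = [] is an empty list to which the ticker-specific dfs are appended, so that I can concatenate them into a new all-inclusive df_all.
  2. df_all is the df with all my stock info (time-series stock data for multiple tickers).
  3. df holds the same information, but filtered to only the ticker currently being iterated over.
  4. The df above (per ticker) gets more information and new columns from each kpi[no]_calc function, and should then be appended to the list df_kpi_list = [].
  5. What is the smartest way to handle this calculated information so that it ends up in the all-inclusive df_all?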

1 answer:

Answer 0: (score: 0)

Make sure that df = df_all

copies the content and not just the reference. Otherwise it may mess up your calculations later.
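As a minimal sketch of the difference (the small frame and the column name KPI are made up for illustration): plain assignment only binds a second name to the same DataFrame, while .copy() creates an independent object.

```python
import pandas as pd

# Illustrative data only, not the asker's real prices
df_all = pd.DataFrame({"Ticker": ["nflx", "aapl"], "Close": [3, 2]})

alias = df_all               # same object, nothing is copied
independent = df_all.copy()  # new object with its own data

alias["RSI"] = 3             # this column also appears in df_all
independent["KPI"] = 1       # this column does not appear in df_all

alias_is_same = alias is df_all
df_all_columns = list(df_all.columns)
print(alias_is_same, df_all_columns)
```

Note that a boolean filter such as df[df['Ticker'] == ticker] already returns a new object, but pandas may emit a warning when columns are later assigned to it, so an explicit .copy() after filtering is probably the safer habit.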

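In general there are two approaches:

  1. Build up the total on the fly, as you iterate
  2. Save the results in a list, then combine the list at the end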
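The two approaches could look roughly like this, reading "total" as the combined per-ticker frames. This is only a sketch: add_kpis is a hypothetical stand-in for the kpi[no]_calc functions, and the data is a reduced version of the question's example.

```python
import pandas as pd

df_all = pd.DataFrame({
    "Date": [1, 1, 2, 2],
    "Ticker": ["nflx", "aapl", "nflx", "aapl"],
    "Close": [3, 2, 2, 1],
})
stocklist = ["nflx", "aapl"]

def add_kpis(df):
    # hypothetical stand-in for the KPI functions: adds one constant column
    out = df.copy()
    out["RSI"] = 3
    return out

# Approach 1: grow the combined frame inside the loop
running = None
for ticker in stocklist:
    part = add_kpis(df_all[df_all["Ticker"] == ticker])
    running = part if running is None else pd.concat([running, part])

# Approach 2: collect the per-ticker frames, then concatenate once
parts = []
for ticker in stocklist:
    parts.append(add_kpis(df_all[df_all["Ticker"] == ticker]))
collected = pd.concat(parts, ignore_index=True)

print(collected)
```

Both give the same rows. The second approach concatenates only once, which tends to be cheaper when there are many tickers, because every pd.concat call builds a new frame from all the data passed to it.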