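Computing values shifted across columns, or looping over columns

Time: 2017-12-13 00:10:34

Tags: python python-3.x pandas loops

I have the following DataFrame. For each cohort, I first want to compute the ratio value(year + n) / value(year), where year is the cohort's starting year (2009 for the first cohort), and then take the mean of those ratios across cohorts.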

df
             id                                                        
year       2009     2010     2011     2012     2013     2014     2015   
cohort                                                                  
2009.0  72092.0  60513.0  48797.0  40968.0  34919.0  30452.0  26961.0   
2010.0      NaN  73735.0  61899.0  50263.0  42184.0  36150.0  31516.0   
2011.0      NaN      NaN  76809.0  64093.0  51372.0  43277.0  36994.0   
2012.0      NaN      NaN      NaN  69776.0  57621.0  46453.0  39098.0   
2013.0      NaN      NaN      NaN      NaN  71613.0  58996.0  47657.0   
2014.0      NaN      NaN      NaN      NaN      NaN  65430.0  52540.0   
2015.0      NaN      NaN      NaN      NaN      NaN      NaN  67121.0   
2016.0      NaN      NaN      NaN      NaN      NaN      NaN      NaN   
2017.0      NaN      NaN      NaN      NaN      NaN      NaN      NaN  

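I will show the math I want to perform, because my English is not good and math is a universal language :)

For 1 year elapsed since each cohort's starting year (n = 1):

First required value = ((60513.0 / 72092.0) + (61899.0 / 73735.0) + (64093.0 / 76809.0) + (57621.0 / 69776.0) + (58996.0 / 71613.0) + (52540.0 / 65430.0)) / 6

For 2 years elapsed (n = 2):

Second required value = ((48797.0 / 72092.0) + (50263.0 / 73735.0) + (51372.0 / 76809.0) + (46453.0 / 69776.0) + (47657.0 / 71613.0)) / 5

For 3 years elapsed (n = 3) (this is the last one I will write out; I think it makes clear what loop I want):

Third required value = ((40968.0 / 72092.0) + (42184.0 / 73735.0) + (43277.0 / 76809.0) + (39098.0 / 69776.0)) / 4

And so on, until the last value:

Last value = 26961.0 / 72092.0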
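The three worked cases above appear to follow one pattern, which can be written compactly. The symbols here are notation introduced for illustration, not taken from the original post: v(c, y) is the table entry for cohort c in year y, and C_n is the set of cohorts for which year c + n is present in the table.

```latex
r_n \;=\; \frac{1}{|C_n|} \sum_{c \in C_n} \frac{v_{c,\,c+n}}{v_{c,\,c}},
\qquad
C_n = \{\, c \;:\; v_{c,\,c+n} \text{ is not NaN} \,\}
```

For the table shown, with years 2009 to 2015, this gives |C_n| = 7 - n for n = 1, ..., 6, which matches the divisors 6, 5, 4 in the examples and the single term in the last value.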

Thanks in advance, and sorry for my English.

I am trying something like this; maybe it helps:

First value:

 ((df1.iloc[0,1]/df1.iloc[0,0]) + (df1.iloc[1,2]/df1.iloc[1,1]) + 
 (df1.iloc[2,3]/df1.iloc[2,2]) + (df1.iloc[3,4]/df1.iloc[3,3]) + 
 (df1.iloc[4,5]/df1.iloc[4,4]) + (df1.iloc[5,6]/df1.iloc[5,5]))/6

Second value:

 ((df1.iloc[0,2]/df1.iloc[0,0]) + (df1.iloc[1,3]/df1.iloc[1,1]) + 
 (df1.iloc[2,4]/df1.iloc[2,2]) + (df1.iloc[3,5]/df1.iloc[3,3]) + 
 (df1.iloc[4,6]/df1.iloc[4,4]))/5

Third value:

 ((df1.iloc[0,3]/df1.iloc[0,0]) + (df1.iloc[1,4]/df1.iloc[1,1]) + 
 (df1.iloc[2,5]/df1.iloc[2,2]) + (df1.iloc[3,6]/df1.iloc[3,3]))/4

Something like this, but with a loop.
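One possible way to turn the hand-written `iloc` expressions into a loop is to read the table along its diagonals: the main diagonal holds each cohort's starting value, and the diagonal at offset n holds the values n years later. The sketch below assumes a square NumPy array containing only the 2009-2015 cohorts (the all-NaN 2016 and 2017 rows are dropped); `mean_ratio_by_offset` is a hypothetical helper name, not something from the original post.

```python
import numpy as np

nan = np.nan
# Numbers copied from the question's table: rows are cohorts 2009-2015,
# columns are years 2009-2015. The all-NaN cohorts 2016 and 2017 are omitted
# so that the array is square (an assumption of this sketch).
values = np.array([
    [72092, 60513, 48797, 40968, 34919, 30452, 26961],
    [nan,   73735, 61899, 50263, 42184, 36150, 31516],
    [nan,   nan,   76809, 64093, 51372, 43277, 36994],
    [nan,   nan,   nan,   69776, 57621, 46453, 39098],
    [nan,   nan,   nan,   nan,   71613, 58996, 47657],
    [nan,   nan,   nan,   nan,   nan,   65430, 52540],
    [nan,   nan,   nan,   nan,   nan,   nan,   67121],
], dtype=float)


def mean_ratio_by_offset(table):
    """Return {n: mean of table[i, i + n] / table[i, i]} for n = 1 .. size - 1."""
    size = table.shape[0]
    base = np.diagonal(table)  # starting value of every cohort
    result = {}
    for n in range(1, size):
        shifted = np.diagonal(table, offset=n)  # values n years after the start
        # Only the first size - n cohorts have a value n years later.
        result[n] = float(np.mean(shifted / base[:size - n]))
    return result


ratios = mean_ratio_by_offset(values)
for n, ratio in ratios.items():
    print(n, ratio)
```

For the table above, the n = 1 entry comes out at roughly 0.83, and the n = 6 entry is exactly 26961.0 / 72092.0, in line with the hand-written expressions.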

2 Answers:

Answer 0: (score: 0)

How about this?

---- Update ----

import numpy as np

def sum_with_shift(df, n):
    # Expects cohort years as the index and calendar years as the columns (both integers)
    row_values = []
    for i, row in df.iterrows():
        # Skip cohorts that have no column n years after their starting year
        if i + n <= df.columns.max():
            # value n years later divided by the cohort's starting value
            row_values += [row.loc[i + n] / row.loc[i]]

    if row_values:
        return np.mean(row_values)
    else:
        return 0

Pass your df with n=1:

sum_with_shift(df, 1)

60513.0 / 72092.0
61899.0 / 73735.0
64093.0 / 76809.0
57621.0 / 69776.0
58996.0 / 71613.0
52540.0 / 65430.0

≈ 0.8277

Pass your df with n=2:

sum_with_shift(df, 2)

48797.0 / 72092.0
50263.0 / 73735.0
51372.0 / 76809.0
46453.0 / 69776.0
47657.0 / 71613.0

≈ 0.6717

---- Update ----

For reproducibility, try running the following code to generate df:

df_as_json = '{"2009":{"2009":72092.0,"2010":null,"2011":null,"2012":null,"2013":null,"2014":null,"2015":null,"2016":null,"2017":null},"2010":{"2009":60513.0,"2010":73735.0,"2011":null,"2012":null,"2013":null,"2014":null,"2015":null,"2016":null,"2017":null},"2011":{"2009":48797.0,"2010":61899.0,"2011":76809.0,"2012":null,"2013":null,"2014":null,"2015":null,"2016":null,"2017":null},"2012":{"2009":40968.0,"2010":50263.0,"2011":64093.0,"2012":69776.0,"2013":null,"2014":null,"2015":null,"2016":null,"2017":null},"2013":{"2009":34919.0,"2010":42184.0,"2011":51372.0,"2012":57621.0,"2013":71613.0,"2014":null,"2015":null,"2016":null,"2017":null},"2014":{"2009":30452.0,"2010":36150.0,"2011":43277.0,"2012":46453.0,"2013":58996.0,"2014":65430.0,"2015":null,"2016":null,"2017":null},"2015":{"2009":26961.0,"2010":31516.0,"2011":36994.0,"2012":39098.0,"2013":47657.0,"2014":52540.0,"2015":67121.0,"2016":null,"2017":null}}'

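import io
import pandas as pd

df = pd.read_json(io.StringIO(df_as_json))  # StringIO: newer pandas deprecates literal JSON strings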
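As an alternative sketch that stays in pandas and avoids an explicit function per n, each cohort row can be left-aligned so that column n means "n years after the cohort started", and then averaged column by column. This assumes the same layout as the frame above (cohort years as the index, calendar years as the columns) and that a cohort has no gaps once it starts. The frame is built directly here instead of via `read_json`, and the names `aligned` and `mean_ratio` are illustrative only.

```python
import numpy as np
import pandas as pd

nan = np.nan
years = list(range(2009, 2016))
# Same numbers as the question's table; the all-NaN cohorts 2016 and 2017
# are left out because they have no starting value to divide by.
data = [
    [72092, 60513, 48797, 40968, 34919, 30452, 26961],
    [nan,   73735, 61899, 50263, 42184, 36150, 31516],
    [nan,   nan,   76809, 64093, 51372, 43277, 36994],
    [nan,   nan,   nan,   69776, 57621, 46453, 39098],
    [nan,   nan,   nan,   nan,   71613, 58996, 47657],
    [nan,   nan,   nan,   nan,   nan,   65430, 52540],
    [nan,   nan,   nan,   nan,   nan,   nan,   67121],
]
df = pd.DataFrame(data, index=years, columns=years, dtype=float)

# Left-align every cohort: drop the leading NaNs, divide by the cohort's
# first value, and relabel the positions as offsets 0, 1, 2, ...
aligned = pd.DataFrame({
    cohort: pd.Series(row.dropna().to_numpy() / row.dropna().iloc[0])
    for cohort, row in df.iterrows()
}).T

# Column n now holds value(year + n) / value(year) per cohort.
# mean() skips NaN by default; offset 0 is always 1.0, so it is dropped.
mean_ratio = aligned.mean().drop(0)
print(mean_ratio)
```

The result is a Series indexed by n = 1 to 6, which may be more convenient than separate calls when all offsets are needed at once.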

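Answer 1: (score: 0)

So it looks like you are trying to iterate from the first year (column) of the table to the last year (column). Then, in your math, you are essentially doing the same thing again, only starting from the year you are currently on and running to the last year. It looks like you need a loop; the code below nests one inside another.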

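numcols = df1.shape[1]  # number of year columns (7 for the table above)
for year in range(0, numcols - 1):
    # The offset is n = year + 1; only numcols - 1 - year cohorts have both values
    count = numcols - 1 - year
    total = 0  # renamed from sum to avoid shadowing the built-in
    for x in range(year, numcols - 1):
        # Row x - year is the cohort; its starting value sits on the diagonal
        total += df1.iloc[x - year, 1 + x] / df1.iloc[x - year, x - year]
    print("Answer for n = {} is: {}".format(year + 1, total / count))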