Question

I have a function that takes a DataFrame and an integer as arguments:

func(df, int)

The function returns a new DataFrame, for example:

df2 = func(df,2)

I want to write a loop over the integers 2 to 10 that produces 9 DataFrames. If I did this by hand, it would look like this:

df2 = func(df,2) 
df3 = func(df2,3) 
df4 = func(df3,4) 
df5 = func(df4,5) 
df6 = func(df5,6) 
df7 = func(df6,7) 
df8 = func(df7,8) 
df9 = func(df8,9) 
df10 = func(df9,10)

Is it possible to write a loop that does this?

Answer 1

This sort of thing is what lists are for.

data_frames = [df]
for i in range(2, 11):
    data_frames.append(func(data_frames[-1], i))

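When you see variable names like df1, df2, df3 and so on, that is a sign the code is becoming brittle. When you are building a sequence of related objects, use a list.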
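For anyone who wants to run the list version end to end, here is a minimal sketch. The func below is only a hypothetical stand-in (it adds the integer to every cell), and the sample values for df are borrowed from Answer 3; the asker's real function and data are not shown in the question.

```python
import pandas as pd


def func(frame, n):
    # Hypothetical stand-in for the asker's function: adds n to every cell.
    return frame + n


# Sample data (assumed; same values as the sample in Answer 3).
df = pd.DataFrame({"a": [75, 48], "b": [18, 56], "c": [17, 3]})

data_frames = [df]
for i in range(2, 11):
    # Feed the most recent result back into func.
    data_frames.append(func(data_frames[-1], i))

# data_frames[0] is the original df, so the hand-written dfN
# corresponds to data_frames[N - 1].
df2 = data_frames[1]
df10 = data_frames[9]

print(len(data_frames))
print(df10)
```

The list holds 10 entries: the original df followed by the 9 results.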

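To clarify, data_frames is a list of DataFrames that can be concatenated with data_frames = pd.concat(data_frames, sort=False), giving a single DataFrame that combines the original df and all the results produced by the loop, correct?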

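Yes, that's right. If your goal is a single final DataFrame, you can concatenate the whole list at the end to combine the information into one frame.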
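A small sketch of that concatenation step, using two made-up frames rather than the asker's data. Passing ignore_index=True is an optional choice made here so the combined frame gets a fresh 0..n-1 index instead of repeating each frame's own index.

```python
import pandas as pd

# Two made-up frames standing in for the entries of data_frames.
frames = [
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}),
    pd.DataFrame({"a": [5, 6], "b": [7, 8]}),
]

# Stack the frames vertically into one DataFrame.
combined = pd.concat(frames, sort=False, ignore_index=True)
print(combined)
```

Binding the result to a new name such as combined, instead of reassigning data_frames, keeps the original list available afterwards.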

Would you mind explaining why data_frames[-1], which takes the last item of the list, returns a DataFrame? It's not clear to me.

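Because every entry added to the list while it is being built is a DataFrame. data_frames[-1] evaluates to the last element of the list, which in this case is the DataFrame you most recently appended.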
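The point about data_frames[-1] can be checked directly. This sketch assumes a hypothetical func that multiplies every cell by the integer; any function that returns a DataFrame would behave the same way.

```python
import pandas as pd


def func(frame, n):
    # Hypothetical stand-in: multiplies every cell by n.
    return frame * n


data_frames = [pd.DataFrame({"x": [1, 2]})]
for i in range(2, 5):
    # Index -1 means "last element", i.e. index len(data_frames) - 1.
    previous = data_frames[-1]
    # The list only ever receives DataFrames, so this always holds.
    assert isinstance(previous, pd.DataFrame)
    data_frames.append(func(previous, i))

print(data_frames[-1])
```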

Answer 2

You can use exec with a formatted string:

for i in range(2, 11):
    exec("df{0} = func(df{1}, {0})".format(i, i - 1 if i > 2 else ''))
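If the goal is only to look results up by their number, a dict keyed by the integer is one possible alternative that needs no exec. This is a sketch under assumptions: func is a hypothetical stand-in that adds the integer to every cell, and frames_by_step is a name invented for the example.

```python
import pandas as pd


def func(frame, n):
    # Hypothetical stand-in for the asker's function: adds n to every cell.
    return frame + n


df = pd.DataFrame({"a": [75, 48], "b": [18, 56], "c": [17, 3]})

# Key 1 holds the original df; key i holds what the question calls df{i}.
frames_by_step = {1: df}
for i in range(2, 11):
    frames_by_step[i] = func(frames_by_step[i - 1], i)

df10 = frames_by_step[10]
print(df10)
```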

Answer 3

You can try itertools.accumulate, as follows:

Sample data

df:
    a   b   c
0  75  18  17
1  48  56   3

import itertools

def func(x, y):
    return x + y

dfs = list(itertools.accumulate([df] + list(range(2, 11)), func))

[    a   b   c
 0  75  18  17
 1  48  56   3,     a   b   c
 0  77  20  19
 1  50  58   5,     a   b   c
 0  80  23  22
 1  53  61   8,     a   b   c
 0  84  27  26
 1  57  65  12,     a   b   c
 0  89  32  31
 1  62  70  17,     a   b   c
 0  95  38  37
 1  68  76  23,      a   b   c
 0  102  45  44
 1   75  83  30,      a   b   c
 0  110  53  52
 1   83  91  38,      a    b   c
 0  119   62  61
 1   92  100  47,      a    b   c
 0  129   72  71
 1  102  110  57]

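dfs is a list of the resulting DataFrames. The first entry is the original df, and each later entry is the previous result plus the next integer from 2 to 10.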
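On Python 3.8 or later, itertools.accumulate also accepts an initial keyword, which avoids building a mixed list of one DataFrame followed by integers. A sketch with the same sample data and the same toy func:

```python
import itertools

import pandas as pd


def func(x, y):
    # Same toy function as above: adds the integer y to every cell of x.
    return x + y


df = pd.DataFrame({"a": [75, 48], "b": [18, 56], "c": [17, 3]})

# initial=df seeds the accumulation, so the iterable holds only the integers.
dfs = list(itertools.accumulate(range(2, 11), func, initial=df))

print(len(dfs))
print(dfs[-1])
```

The result still starts with the original df, followed by the 9 accumulated frames.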

If you want to concatenate them all, use pd.concat:

pd.concat(dfs)

Out[29]:
     a    b   c
0   75   18  17
1   48   56   3
0   77   20  19
1   50   58   5
0   80   23  22
1   53   61   8
0   84   27  26
1   57   65  12
0   89   32  31
1   62   70  17
0   95   38  37
1   68   76  23
0  102   45  44
1   75   83  30
0  110   53  52
1   83   91  38
0  119   62  61
1   92  100  47
0  129   72  71
1  102  110  57
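In the concatenated output the row labels 0 and 1 repeat, so it is hard to tell which step a row came from. One option is the keys argument of pd.concat, which adds an outer index level. The labels 1 to 10 and the level names "step" and "row" are choices made for this sketch, not part of the original answer.

```python
import itertools

import pandas as pd


def func(x, y):
    # Same toy function as above: adds the integer y to every cell of x.
    return x + y


df = pd.DataFrame({"a": [75, 48], "b": [18, 56], "c": [17, 3]})
dfs = list(itertools.accumulate([df] + list(range(2, 11)), func))

# Label 1 is the original df; label k is the frame produced by adding k.
labeled = pd.concat(dfs, keys=list(range(1, 11)), names=["step", "row"])

# Select every row that belongs to step 3.
step3 = labeled.loc[3]
print(step3)
```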

Looping a function with Pandas DataFrames

3 answers: