Question

I'm very new to Python, so I'd really appreciate your comments and explanations. I have a DataFrame with 40000 entries:

id              40000 non-null int64
feature_1        40000 non-null float64
feature_2        40000 non-null float64
feature_3        40000 non-null float64
feature_4        40000 non-null float64

For each id, I need to compute a value from the following equation, using a coefficient c_n for each feature:

eq_n=feature_1*c_1+feature_2*c_2+feature_3*c_3+feature_4*c_4

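Each c_n can range from 0 to 1 in steps of 0.1 (0, 0.1, 0.2, ..., 1), so the number of combinations is 11^4: 11 because of the number of steps (0, 0.1, ..., 1), and 4 because of the 4 features.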

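I think I need to create a 4d array of the coefficients first and then loop over it for the further calculations. But I'm stuck on creating this 4d matrix of coefficients and filling it. I tried to create the matrix with np.zeros([11,4,11,4]), but I'm not sure I chose the dimensions correctly for the 11^4 requirement, and it's not clear to me how to fill it with the required elements.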
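As a side note on the shape: np.zeros([11,4,11,4]) holds 11*4*11*4 = 1936 cells, not 11^4 = 14641. A grid with one axis per coefficient would have shape (11, 11, 11, 11). Below is a minimal sketch of one way to build it, assuming NumPy; the names `steps` and `grid` are made up for illustration.

```python
import numpy as np

steps = np.linspace(0.0, 1.0, 11)  # 11 values: 0, 0.1, ..., 1

# One axis per coefficient: four arrays, each of shape (11, 11, 11, 11).
# With indexing="ij", c1[a, b, c, d] == steps[a], c2[a, b, c, d] == steps[b], etc.
c1, c2, c3, c4 = np.meshgrid(steps, steps, steps, steps, indexing="ij")

# Stack along a new last axis: grid[a, b, c, d] is one coefficient vector of length 4.
grid = np.stack([c1, c2, c3, c4], axis=-1)

print(np.zeros([11, 4, 11, 4]).size)  # 1936
print(grid.shape)                     # (11, 11, 11, 11, 4)
print(grid[1, 0, 10, 3])              # the vector for c_1=0.1, c_2=0, c_3=1, c_4=0.3
```

Reshaping with grid.reshape(-1, 4) gives a flat table of 14641 rows, one per combination, which is usually easier to loop over than the 4d layout.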

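I started with a simple setup: create an array of zeros and change it in a loop. It definitely needs adjusting, because it covers far fewer combinations. Please see what I did below: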

M = df  # DataFrame without the id column, for simplicity
# calc is the name of the function that makes further calculations
# using the product of arrays
K = [0, 0, 0, 0]  # coefficient array
J = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]  # steps for coefficients
ind = 0  # index of the coefficient currently being assigned
for i in K:
    for z in J:
        K[ind] = z
        prod = K * M
        calc(prod)
        print(prod)
    ind = ind + 1

Answer 1

Volume

Unfortunately, this will probably give you MemoryErrors (it did for me), because the final output is very large: 11 ** 4 * 40000 * 64 bits ≈ 4.7 GB.
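The size estimate can be checked with a few lines of arithmetic. This sketch assumes the full result stores one float64 value per (combination, row) pair; the variable names are illustrative only.

```python
n_combos = 11 ** 4        # 14641 coefficient combinations
n_rows = 40000            # rows in the DataFrame
bytes_per_value = 8       # one float64 is 64 bits = 8 bytes

total_bytes = n_combos * n_rows * bytes_per_value
print(total_bytes)                      # 4685120000
print(round(total_bytes / 1e9, 2))      # 4.69 (decimal gigabytes)
print(round(total_bytes / 2 ** 30, 2))  # 4.36 (binary gibibytes)
```

So the full result lands somewhere between 4.4 and 4.7 "GB" depending on the unit, which is enough to exhaust memory on many machines if it is materialized at once.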

Answer 2

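I decided to drop the idea of a 4d array and came up with a simpler algorithm:

Work out the number of combinations and build a 2D array from those combinations.

As I mentioned before, the number of combinations is 11 ** 4.
So the next thing is to get every combination using the following function: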

import numpy as np

def combinations(n, m):
    steps = np.arange(0, 1.1, 0.1)  # 0, 0.1, ..., 1 (11 values)
    qty_of_combs = n**m
    combs = np.zeros((qty_of_combs, m), dtype=float)
    for i in range(m):  # for each column
        k = n**i  # each step value repeats k times in column i
        q = 0
        while q < qty_of_combs:
            for z in range(n):
                for j in range(k):
                    combs[q, i] = steps[z]
                    q += 1
    return combs
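For comparison, the same set of combinations can probably be produced with the standard library's itertools.product. This is an alternative sketch, not the answer's method; the names `steps` and `combs` are illustrative.

```python
import itertools

import numpy as np

steps = np.linspace(0.0, 1.0, 11)  # 11 values: 0, 0.1, ..., 1

# Cartesian product of the step values with itself four times:
# one row per combination, one column per coefficient.
combs = np.array(list(itertools.product(steps, repeat=4)))

print(combs.shape)  # (14641, 4)
```

One difference to be aware of: itertools.product varies the last column fastest, whereas the hand-written function varies the first column fastest, so the rows come out in a different order even though the set of combinations is the same.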
The last step is to compute the output using each combination.

comb_ar = combinations(11, 4)
for i in range(comb_ar.shape[0]):
    # scale each feature column by its coefficient for combination i
    output = comb_ar[i, :] * df

Here df contains only the feature columns, so we can compute the product of the arrays.
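If what is ultimately needed is the weighted sum from the question (feature_1*c_1 + ... + feature_4*c_4), the per-row loop can be replaced by a matrix product computed in chunks, so the full 14641 x 40000 result never has to sit in memory at once. This is a sketch under assumptions: `features` is random stand-in data for df.to_numpy(), and `weighted_sums`, `chunk_size`, and the per-combination mean are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((1000, 4))  # stand-in for df.to_numpy(); 1000 rows for the demo

steps = np.linspace(0.0, 1.0, 11)
# Flat (14641, 4) table of all coefficient combinations.
combs = np.stack(
    np.meshgrid(steps, steps, steps, steps, indexing="ij"), axis=-1
).reshape(-1, 4)

def weighted_sums(features, combs, chunk_size=1000):
    """Yield (start, block), where block[i, j] is the weighted sum of row i
    of `features` under combination combs[start + j]."""
    for start in range(0, len(combs), chunk_size):
        chunk = combs[start:start + chunk_size]
        yield start, features @ chunk.T  # shape (n_rows, len(chunk))

# Example reduction: keep only the mean weighted sum per combination.
means = np.empty(len(combs))
for start, block in weighted_sums(features, combs):
    means[start:start + block.shape[1]] = block.mean(axis=0)

print(means.shape)  # (14641,)
```

With 40000 rows and chunk_size=1000, each block would hold 40000 * 1000 float64 values, about 320 MB; a smaller chunk_size trades speed for memory.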

How do I correctly fill a 4d array/matrix in Python?
