Question

The code block below generates this table:

       Trial Week   Branch  Num_Dep Tot_dep_amt
       1       1      1       4        4200
       1       1      2       7        9000
       1       1      3       6        4800
       1       1      4       6        5800
       1       1      5       5        3800
       1       1      6       4        3200
       1       1      7       3        1600
       .       .      .       .          .
       .       .      .       .          .
       1       1      8       5        6000
       9       19     40      3        2800

Code:

import numpy as np
import pandas as pd

trials = 10
weeks = 20
branch = 41

df = pd.DataFrame()

for a in range(1, trials):
    print("Starting trial", a)
    for b in range(1, weeks):
        for c in range(1, branch):
            # number of deposits: normal(5, 2), rounded to the nearest integer
            depnum = int(np.round(np.random.normal(5, 2)))
            acc_dep = 0
            for d in range(1, depnum):
                # each deposit: normal(1200, 400), rounded to the nearest 200
                dep_amt = int(np.round(np.random.normal(1200, 400) / 200) * 200)
                acc_dep = acc_dep + dep_amt
            temp = pd.DataFrame.from_records([{'Trial': a, 'Week': b, 'branch': c, 'Num_Dep': depnum, 'Tot_dep_amt': acc_dep}])
            df = pd.concat([df, temp])
            df = df[['Trial', 'Week', 'branch', 'Num_Dep', 'Tot_dep_amt']]
            df = df.reset_index(drop=True)

I would like to be able to separate the branches inside the for loop, so that the resulting df has headers like this:

Trial   Week   Branch_1_Num_Dep   Branch_1_Tot_dep_amount   Branch_2_Num_Dep   ... etc.

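I know this can be done by generating the DF first and then performing an encoding step on it, but for this task I would like to generate it inside the for loop if possible?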
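For reference, the post-hoc route mentioned above could look roughly like the sketch below. It assumes a long-format frame with the same column names as the table, and the few rows of values are illustrative only. `pivot` with a list-valued `index` needs a reasonably recent pandas (1.1 or later).

```python
import pandas as pd

# Tiny hand-made long-format frame (illustrative values, not real results).
long_df = pd.DataFrame({
    "Trial": [1, 1, 1, 1],
    "Week": [1, 1, 2, 2],
    "branch": [1, 2, 1, 2],
    "Num_Dep": [4, 7, 6, 6],
    "Tot_dep_amt": [4200, 9000, 4800, 5800],
})

# One row per (Trial, Week); one column per (measure, branch) pair.
wide = long_df.pivot(index=["Trial", "Week"], columns="branch",
                     values=["Num_Dep", "Tot_dep_amt"])

# Flatten the two-level column index into names such as Branch_1_Num_Dep.
wide.columns = [f"Branch_{b}_{measure}" for measure, b in wide.columns]
wide = wide.reset_index()
```

Note that the columns come out grouped by measure rather than by branch, so they would need reordering to match the header layout above exactly.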

Answer 1

To achieve this with minimal changes to your code, you can do the following:


import numpy as np
import pandas as pd

# trials, weeks and branch are the values defined in the question
df = pd.DataFrame()
for a in range(1, trials):
    print("Starting trial", a)
    for b in range(1, weeks):
        # one dict per (Trial, Week); every branch adds two keys to it
        records = {'Trial': a, 'Week': b}
        for c in range(1, branch):
            depnum = int(np.round(np.random.normal(5, 2)))
            acc_dep = 0
            # note: range(1, depnum) runs depnum - 1 times, as in the question
            for d in range(1, depnum):
                dep_amt = int(np.round(np.random.normal(1200, 400) / 200) * 200)
                acc_dep = acc_dep + dep_amt

            records['Branch_{}_Num_Dep'.format(c)] = depnum
            records['Branch_{}_Tot_dep_amount'.format(c)] = acc_dep
        # append one wide row per week instead of one row per branch
        temp = pd.DataFrame.from_records([records])
        df = pd.concat([df, temp])
        df = df.reset_index(drop=True)
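If a slightly larger change is acceptable, the repeated `pd.concat` calls can be avoided by collecting one dict per (Trial, Week) in a list and building the DataFrame once at the end. This is only a sketch: the function name `simulate_wide`, its parameters, and the seeded `default_rng` generator are my own additions for illustration, not part of your code.

```python
import numpy as np
import pandas as pd

def simulate_wide(trials=10, weeks=20, branch=41, seed=0):
    """Build the wide table with one row per (Trial, Week)."""
    # Seeded generator: an assumption, added only for reproducibility.
    rng = np.random.default_rng(seed)
    rows = []
    for a in range(1, trials):
        for b in range(1, weeks):
            row = {"Trial": a, "Week": b}
            for c in range(1, branch):
                depnum = int(np.round(rng.normal(5, 2)))
                acc_dep = 0
                for _ in range(1, depnum):  # same loop bounds as the original code
                    acc_dep += int(np.round(rng.normal(1200, 400) / 200) * 200)
                row[f"Branch_{c}_Num_Dep"] = depnum
                row[f"Branch_{c}_Tot_dep_amount"] = acc_dep
            rows.append(row)
    # A single DataFrame construction replaces the per-row pd.concat calls.
    return pd.DataFrame(rows)

# Small run for a quick check: 2 trials x 3 weeks, 4 branches.
df = simulate_wide(trials=3, weeks=4, branch=5)
```

Each `pd.concat` inside a loop copies everything accumulated so far, so appending to a plain list and converting once tends to scale much better as the number of rows grows.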

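Overall, it looks like you could do what you are doing in a more elegant and faster way. I would suggest looking into vectorization as a concept (e.g. here).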
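To give a rough idea of what a vectorized version might look like, here is a sketch using NumPy arrays instead of nested loops. Two assumptions to be aware of: it uses a seeded `default_rng` generator, and it sums exactly `Num_Dep` deposits per cell, whereas the loop's `range(1, depnum)` sums `depnum - 1`. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded generator: an assumption
# Counts produced by range(1, 10), range(1, 20) and range(1, 41).
n_trials, n_weeks, n_branches = 9, 19, 40

# Number of deposits for every (trial, week, branch) cell in one call.
num_dep = np.round(rng.normal(5, 2, size=(n_trials, n_weeks, n_branches))).astype(int)

# Draw enough amounts for the busiest cell, round each to the nearest 200,
# then keep only the first num_dep draws of each cell via a boolean mask.
# Assumption: this sums num_dep deposits, not num_dep - 1 as the loop does.
max_dep = max(int(num_dep.max()), 0)
amounts = np.round(rng.normal(1200, 400, size=num_dep.shape + (max_dep,)) / 200) * 200
mask = np.arange(max_dep) < num_dep[..., None]
tot_amt = (amounts * mask).sum(axis=-1).astype(int)

# Flatten to one row per (trial, week) and one column pair per branch.
trial_idx, week_idx = np.meshgrid(np.arange(1, n_trials + 1),
                                  np.arange(1, n_weeks + 1), indexing="ij")
data = {"Trial": trial_idx.ravel(), "Week": week_idx.ravel()}
flat_num = num_dep.reshape(-1, n_branches)
flat_amt = tot_amt.reshape(-1, n_branches)
for c in range(n_branches):
    data[f"Branch_{c + 1}_Num_Dep"] = flat_num[:, c]
    data[f"Branch_{c + 1}_Tot_dep_amount"] = flat_amt[:, c]
df = pd.DataFrame(data)
```

The mask trick trades some wasted random draws for the removal of the innermost Python loop; cells with a zero or negative deposit count simply get a total of 0, as in the loop version.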
