Pandas: dynamically adding DataFrame columns to a running total DataFrame inside a loop

Asked: 2020-07-06 22:13:54

Tags: python pandas numpy dataframe simpy

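I am writing a simulation program and am trying to append the results of each iteration to a DataFrame that tracks all iterations.

Collecting the results works fine, but I cannot find a way to append each iteration's results as a new column. I have spent a long time on this and cannot get past it.

I have built a simplified version that best explains my problem: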

import simpy
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import pandas as pd

###dataframe for the simulation
df = pd.DataFrame({'Id' : ['1183', '1187']})
df['average_demand'] = [7426,989]
df['lead_time'] = [1.5, 1.5]
df['sale_price'] = [1.98, 2.01]
df['buy_price'] = [0.11, 0.23]
df['beg_inventory'] = [1544,674]
df['margin'] = df['sale_price'] - df['buy_price']
df['holding_cost'] = 0.2/12
df['aggregate_order_placement_cost'] = 1000
df['review_time'] = 0
df['periods'] = 30
#df['cap_ts'] = 1.5
df['min_ts'] = 1
df['low_demand'] = [300, 30]#,3000,350,220,40,42,40,10,25,240]
df['high_demand'] = [1000, 130]#,12000,700,500,100,90,210,135,200,800]
df['low_sd'] = [160,30]#,3400,100,90,10,5,50,26,45,170]
df['high_sd'] = [400,90]#,5500,200,160,60,50,100,78,113,300]
cap_ts = 0

big_df = pd.DataFrame(df)

for i in df.index:
    for cap_ts in range(1,12, 1):
        def warehouse_run(env, df):

            df['inventory'] = df['beg_inventory']
            df['balance'] = 0.0
            df['quantity_on_order'] = 0
            df['count_order_placed'] = 0
            df['commands_on_order'] = 0
            df['demand'] = 0
            df['safety_stock'] = 0
            df['stockout_occurence'] = 0
            df['inventory_position'] = 0

            while True:
                interarrival = generate_interarrival()
                yield env.timeout(interarrival)
                df['balance'] -= df['inventory'] * df['holding_cost'] * interarrival
                df['demand'] = generate_demand()
                if df['demand'].loc[i] < df['inventory'].loc[i]:
                    df['balance'] += df['sale_price'] * df['demand']
                    df['inventory'] -= df['demand']

                    print('{:.2f} sold {}'.format(env.now, df['demand'].loc[i]))

                else:
                    df['balance'] += df['sale_price'] * df['inventory']
                    df['inventory'] = 0
                    df['stockout_occurence'] += 1
                    print('{:.2f} demand {} but inventory{}'.format(env.now, df['demand'].loc[i], df['inventory'].loc[i]))
                    print('{:.2f} sold {} ( nb stockout)'.format(env.now, df['stockout_occurence'].loc[i]))
                if df['demand'].loc[i] > df['inventory'].loc[i]:

                    env.process(handle_order(env,
                                             df))
                    df['count_order_placed'] += 1
                    print("inventory", df['inventory'].loc[i])
                    print("number of orders placed", df['count_order_placed'].loc[i])

        def handle_order(env, df):

            df['quantity_ordered'] = cap_ts *df['average_demand']
            df['quantity_on_order'] += df['quantity_ordered']
            df['commands_on_order'] += 1
            print("{:.2f} placed order for {}".format(env.now, df['quantity_ordered'].loc[i]))
            df['balance'] -= df['buy_price'] * df['quantity_ordered'] + df['aggregate_order_placement_cost']

            yield env.timeout(df['lead_time'].loc[i], 0)
            df['inventory'] += df['quantity_ordered']
            df['quantity_on_order'] -= df['quantity_ordered']
            df['commands_on_order'] -= 1
            print('{:.2f} receive order,{} in inventory'.format(env.now, df['inventory'].loc[i]))


        # number of orders per month
        def generate_interarrival():
            return np.random.exponential(1. / 1)


        # quantity of demand per months
        def generate_demand():
            return np.random.randint(df['low_demand'].loc[i], df['high_demand'].loc[i])


        def generate_standard_deviation():
            return np.random.randint(df['low_sd'].loc[i], df['high_sd'].loc[i])


        obs_time = []
        inventory_level = []
        demand_level = []
        safety_stock_level = []
        inventory_position_level = []


        def observe(env, df):
            while True:
                obs_time.append(env.now)
                inventory_level.append(df['inventory'].loc[i])
                demand_level.append(df['demand'].loc[i])
                safety_stock_level.append(df['safety_stock'].loc[i])
                inventory_position_level.append(df['inventory_position'].loc[i])
                yield env.timeout(0.1)


        np.random.seed(0)

        env = simpy.Environment()
        env.process(warehouse_run(env, df))
        env.process(observe(env, df))

        # #RUN FOR 12 MONTHS
        env.run(until=36.0)
        recap = pd.DataFrame(df.loc[i])
        recap = recap.transpose()

        #big_df.append(recap)
        big_df['Iteration {}'.format(i)] = recap
        print(recap)

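The problem in this code is appending the results contained in recap to big_df. Ideally, at the end of the simulation, big_df would contain 24 columns, one column of results for each iteration of the simulation. Any help would be appreciated, thanks.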

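Update: thanks to wnsfan40, I was able to get a df that merges the results of each iteration, but big_df is reset on every iteration and does not keep accumulating each new df.

The expected output looks like this: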

      Id result_columns
0   11198              x
1   11198              x
2   11198              x
3   11198              x
4   11198              x
5   11198              x
6   11198              x
7   11198              x
8   11198              x
9   11198              x
10  11198              x
11  11198              x
12  11187              y
13  11187              y
14  11187              y
15  11187              y
16  11187              y
17  11187              y
18  11187              y
19  11187              y
20  11187              y
21  11187              y
22  11187              y
23  11187              y

result_columns is shorthand for all the columns that hold the results for each row.

1 answer:

Answer 0: (score: 0)

When initializing big_df, assign df's index as its index:

big_df = pd.DataFrame(index = df.index)
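A minimal sketch of why the shared index matters. The small `df` and the `result` Series here are made-up stand-ins, not the simulation data. An empty frame built with `index=df.index` carries the same row labels as `df`, so a Series assigned as a new column is aligned by label rather than by position:

```python
import pandas as pd

# Hypothetical stand-in for the simulation frame (not the real data)
df = pd.DataFrame({'Id': ['1183', '1187'], 'beg_inventory': [1544, 674]})

# Empty frame that shares df's row labels (0 and 1)
big_df = pd.DataFrame(index=df.index)

# A Series whose labels are in a different order is still aligned
# by label when it is assigned as a column
result = pd.Series([674, 1544], index=[1, 0])
big_df['Iteration 0'] = result

print(big_df)
```

Row 0 receives the value labeled 0 and row 1 the value labeled 1, regardless of the order in which the Series stores them.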

Try switching from appending to assigning column values, for example:

big_df['Iteration {}'.format(i)] = recap
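Two details in the question's loop may explain why results do not accumulate: the label `'Iteration {}'.format(i)` depends only on `i`, so every `cap_ts` pass writes to the same column, and the commented-out `big_df.append(recap)` returned a new frame instead of modifying `big_df` (the method was removed in pandas 2.0). If the long layout from the expected output is the goal, one alternative is to collect one result row per run in a list and build the frame once at the end. The sketch below assumes this; `run_simulation` is a hypothetical placeholder for the SimPy run, and the two-column `df` is a cut-down stand-in for the real inputs:

```python
import pandas as pd

# Cut-down stand-in for the simulation inputs (assumption, not the full frame)
df = pd.DataFrame({'Id': ['1183', '1187'], 'average_demand': [7426, 989]})


def run_simulation(row, cap_ts):
    """Hypothetical placeholder for one SimPy run; returns one result row."""
    result = row.copy()
    result['cap_ts'] = cap_ts
    result['quantity_ordered'] = cap_ts * row['average_demand']
    return result


rows = []  # created once, before both loops, so it is never reset
for i in df.index:
    for cap_ts in range(1, 12):
        rows.append(run_simulation(df.loc[i], cap_ts))

# Build the accumulated frame once: one row per (i, cap_ts) run
big_df = pd.DataFrame(rows).reset_index(drop=True)
print(big_df.shape)
```

With `range(1, 12)` there are 11 runs per item, so two items give 22 rows. Building the frame once from a list also avoids repeatedly copying a growing DataFrame inside the loop.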