I have two DataFrames, machines and demand, as shown below:
machines
import pandas as pd
import numpy as np
import itertools
dates = pd.Series([d.date() for d in pd.date_range('1/1/2018', periods=4, freq='W')])
sites = pd.Series('TH, ID'.split(', '))  # split on ', ' so site names carry no leading space
product = list('AB')
machine = ['M1', 'M2'] # can have multiple machines per site for a product
machines = pd.DataFrame([e for e in itertools.product(dates, sites, product, machine)],
                        columns=['Date', 'Site', 'Product', 'Machine'])
machines['Capacity'] = np.random.randint(10, 35, size=len(machines))
machines['Week'] = pd.to_datetime(machines.Date).dt.strftime('%Y-%U')
machines.head()
demand
s = 'TH, ID, IN'.split(', ')  # same separator as for sites, so the merge keys match
s = s + s[::-1]
demand_days = pd.date_range('1/1/2018', periods=12*7, freq='D')
demand_days
demand = pd.DataFrame([e for e in itertools.product([d.date() for d in demand_days], s, product)],
columns=['Date', 'Site', 'Product'])
demand['Qty'] = np.random.randint(1,50, len(demand))
demand['Week'] = pd.to_datetime(demand.Date).dt.strftime('%Y-%U')
demand.head()
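As a side check on the Week key used for merging below, here is a minimal sketch comparing the week labels the two date ranges produce. It assumes the same date ranges as above; machine_weeks and demand_weeks are names introduced only for this illustration. With '%U', weeks start on Sunday and days before the first Sunday of the year fall into week 00.

```python
import pandas as pd

# Weekly machine dates: freq='W' anchors on Sundays, so the range starts at 2018-01-07
weekly = pd.date_range('1/1/2018', periods=4, freq='W')
# Daily demand dates: 12 weeks of days starting on Monday 2018-01-01
daily = pd.date_range('1/1/2018', periods=12 * 7, freq='D')

# '%U' = week of year with Sunday as the first day of the week
machine_weeks = set(weekly.strftime('%Y-%U'))
demand_weeks = set(daily.strftime('%Y-%U'))

# Weeks that have demand but no machine capacity rows
demand_only_weeks = sorted(demand_weeks - machine_weeks)
print(sorted(machine_weeks))
print(demand_only_weeks)
```

Under these assumptions only four of the demand weeks have capacity rows, so a default (inner) merge on Week keeps just those weeks.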
Merging and then grouping to get demand.Qty shows:
pd.merge(demand, machines[['Week', 'Site', 'Product', 'Machine', 'Capacity']], on=['Week', 'Site', 'Product']).set_index('Week').groupby(['Week', 'Site', 'Product']).Qty.sum()[:5]
Merging and then grouping to get machines.Capacity shows:
pd.merge(demand, machines[['Week', 'Site', 'Product', 'Machine', 'Capacity']], on=['Week', 'Site', 'Product']).set_index('Week').groupby(['Week', 'Site', 'Product', 'Machine']).Capacity.sum()[:5]
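One thing to watch with both merges: joining on Week, Site and Product pairs every demand row with every matching machine row, so the sums are computed over repeated rows. The sketch below uses toy data with made-up values to show the effect, and one possible way around it: aggregate each frame to one row per key first, then merge the aggregates. The names keys, weekly_demand, weekly_capacity and combined are hypothetical, and how='left' is an assumption that demand without any machine should be kept with a missing Capacity.

```python
import pandas as pd

# Toy data (made-up values): two demand rows and two machines sharing one key
demand = pd.DataFrame({
    'Week': ['2018-01', '2018-01'],
    'Site': ['TH', 'TH'],
    'Product': ['A', 'A'],
    'Qty': [10, 20],
})
machines = pd.DataFrame({
    'Week': ['2018-01', '2018-01'],
    'Site': ['TH', 'TH'],
    'Product': ['A', 'A'],
    'Machine': ['M1', 'M2'],
    'Capacity': [15, 25],
})
keys = ['Week', 'Site', 'Product']

# Row-level merge: each demand row appears once per machine,
# so the summed Qty counts every demand row twice here.
naive_qty = pd.merge(demand, machines, on=keys).groupby(keys)['Qty'].sum()

# Aggregate each frame to one row per key, then merge the aggregates.
weekly_demand = demand.groupby(keys, as_index=False)['Qty'].sum()
weekly_capacity = machines.groupby(keys, as_index=False)['Capacity'].sum()
combined = pd.merge(weekly_demand, weekly_capacity, on=keys, how='left')

print(naive_qty)
print(combined)
```

If capacity per machine is needed instead of the total, machines could be left unaggregated and weekly_demand merged onto it, at the cost of Qty repeating on every machine row.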
How can I merge machines and demand into a single, unified DataFrame, for example: