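I would like to multiply a lookup table (demand), given for several commodities (here: Water, Elec) and area types (Com, Ind, Res), with a DataFrame (areas) that holds a table of areas for these area types.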
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
'Elec':[8,9]}, index=['Com', 'Ind'])
Before:
areas
Com Ind
0 1 4
1 2 5
2 3 6
demand
Elec Water
Com 8 4
Ind 9 3
After:
area_demands
   Com        Ind
  Elec Water Elec Water
0    8     4   36    12
1   16     8   45    15
2   24    12   54    18
My attempt
Verbose and incomplete; it does not work for an arbitrary number of commodities.
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
del both['area']
# almost there; it must be late, I fail to make 'Type' a hierarchical column...
Almost there:
     Type  Elec  Water
Edge
0     Com     8      4
0     Ind    36     12
1     Com    16      8
1     Ind    45     15
2     Com    24     12
2     Ind    54     18
In short
How do I join/multiply the DataFrames areas and demand together in a decent way?
Answer 0 (score: 4)
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
'Elec':[8,9]}, index=['Com', 'Ind'])
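def multiply_by_demand(series):
    # series is one column of areas; series.name is its area type ('Com' or 'Ind').
    # .loc replaces the removed .ix indexer for label-based row lookup.
    return demand.loc[series.name].apply(lambda x: x*series).stack()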
df = areas.apply(multiply_by_demand).unstack(0)
print(df)
yields
   Com        Ind
  Elec Water Elec Water
0    8     4   36    12
1   16     8   45    15
2   24    12   54    18
How this works:
First, look at what happens when we call areas.apply(foo). foo is passed the columns of areas one by one:
def foo(series):
    print(series)
In [226]: areas.apply(foo)
0 1
1 2
2 3
Name: Com, dtype: int64
0 4
1 5
2 6
Name: Ind, dtype: int64
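The same behaviour can be checked without reading printed output. The following is a minimal sketch, assuming a hypothetical helper named record (not part of the original answer) that stores each column it receives under that column's name:

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})

# Hypothetical helper: remembers every column that apply hands to it.
seen = {}

def record(series):
    # series.name is the column label, series holds that column's values
    seen[series.name] = series.tolist()
    return series.sum()

totals = areas.apply(record)
print(seen)
print(totals)
```

Each call receives a Series whose name is the column label, which is what lets multiply_by_demand look up the matching row of demand.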
So suppose series is one such column:
In [230]: series = areas['Com']
In [231]: series
Out[231]:
0 1
1 2
2 3
Name: Com, dtype: int64
We can then multiply demand by this Series like this:
In [229]: demand.loc['Com'].apply(lambda x: x*series)
Out[229]:
0 1 2
Elec 8 16 24
Water 4 8 12
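This has half the numbers we want, but not in the form we want them.
Now apply needs to return a Series, not a DataFrame. One way to turn a DataFrame into a Series is to use stack. Look at what happens if we stack this DataFrame. The columns become a new level of the index:
In [232]: demand.loc['Com'].apply(lambda x: x*areas['Com']).stack()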
Out[232]:
Elec 0 8
1 16
2 24
Water 0 4
1 8
2 12
dtype: int64
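The effect of stack can also be reproduced in isolation. This is a rough sketch that builds the intermediate table for 'Com' with np.outer (a swapped-in construction, not the apply call used in the answer) and then stacks it; the names table and stacked are only illustrative:

```python
import numpy as np
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

# Rows are commodities, columns are the row labels of areas (0, 1, 2).
table = pd.DataFrame(np.outer(demand.loc['Com'], areas['Com']),
                     index=demand.columns, columns=areas.index)

# stack moves the column labels into a second index level,
# turning the 2-D table into a Series keyed by (commodity, row label).
stacked = table.stack()
print(stacked)
```

The resulting Series has a two-level index, so a single value is addressed by a (commodity, row) pair such as ('Elec', 1).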
So, using the stacked Series as the return value of multiply_by_demand, we get:
In [235]: areas.apply(multiply_by_demand)
Out[235]:
         Com  Ind
Elec  0    8   36
      1   16   45
      2   24   54
Water 0    4   12
      1    8   15
      2   12   18
Now we want the first level of the index to become columns. This can be done with unstack:
In [236]: areas.apply(multiply_by_demand).unstack(0)
Out[236]:
   Com        Ind
  Elec Water Elec Water
0    8     4   36    12
1   16     8   45    15
2   24    12   54    18
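For comparison, the same hierarchical result can be assembled without apply, stack, or unstack. The sketch below is an alternative under stated assumptions, not the method above: it builds one small DataFrame per area type and lets pd.concat with a dict create the top column level. The names pieces and area_demands are illustrative.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

# One DataFrame per area type: each commodity column is the area column
# scaled by that area type's demand for the commodity.
pieces = {
    area_type: pd.DataFrame(
        {commodity: areas[area_type] * demand.loc[area_type, commodity]
         for commodity in demand.columns}
    )
    for area_type in areas.columns
}

# The dict keys become the outer level of the column MultiIndex.
area_demands = pd.concat(pieces, axis=1)
print(area_demands)
```

Because both loops run over the existing column labels, this handles any number of commodities and area types, as long as every column of areas has a matching row in demand.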
As requested in the comments, here is the pivot_table solution:
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
'Elec':[8,9]}, index=['Com', 'Ind'])
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
both.reset_index(inplace=True)
both = both.pivot_table(values=['Elec', 'Water'], index='Edge', columns='Type')
both = both.reorder_levels([1,0], axis=1)
both = both.reindex(columns=both.columns[[0,2,1,3]])
print(both)
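The join-based approach still names Elec and Water explicitly, which is the limitation raised in the question. A possible generalization, sketched here with illustrative names (long_areas, commodities, wide), multiplies all commodity columns at once and uses pivot rather than pivot_table, since each (Edge, Type) pair occurs exactly once and no aggregation is needed:

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3],
                       'Elec': [8, 9]}, index=['Com', 'Ind'])

# Long format: one row per (Edge, Type) pair with its area.
long_areas = (areas.stack()
                   .rename('area')
                   .rename_axis(['Edge', 'Type'])
                   .reset_index())

# Attach the demand row that matches each area type.
both = long_areas.join(demand, on='Type')

# Scale every commodity column by the area, whatever the commodities are.
commodities = list(demand.columns)
both[commodities] = both[commodities].mul(both['area'], axis=0)

# Back to wide format, with the area type as the outer column level.
wide = both.pivot(index='Edge', columns='Type', values=commodities)
wide = wide.swaplevel(axis=1).sort_index(axis=1)
print(wide)
```

Sorting the columns replaces the hard-coded reindex positions, so the layout does not depend on how many commodities there are.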