I have two DataFrames. I want to compute a weighted sum, as follows:
ws[0] = a1[0]*c1[0] + a2[0]*c2[0] + a3[0]*c3[0] + ...
ws[1] = a1[1]*c1[1] + a2[1]*c2[1] + a3[1]*c3[1] + ...
ws[2] = a1[2]*c1[2] + a2[2]*c2[2] + a3[2]*c3[2] + ...
...
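The index is a date. The difficulty is that the two DataFrames do not necessarily share the same dates.
So if the criteria row for a given date is missing, the previous criteria should be used. If the next criteria date is later than
the alternatives date, the most recent earlier criteria apply.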
Suppose we have two DataFrames.
The alternatives DataFrame:
date a1 a2 a3
2018-01-01 10.00 10.00 10.00
2018-01-02 20.00 10.00 10.00
2018-01-03 20.00 20.00 20.00
2018-01-13 10.00 20.00 30.00
The criteria DataFrame:
date c1 c2 c3
2018-01-01 0.10 1.00 1.00
2018-01-05 1.00 0.50 1.00
2018-01-13 1.00 1.00 1.00
So the result should be:
date ws
2018-01-01 21.00 # alternative date == criteria date, all systems nominal
2018-01-02 22.00 # criteria date > alternative date, taking 2018-01-02 alternative && 2018-01-01 criteria
2018-01-03 42.00 # criteria date > alternative date, taking 2018-01-03 alternative && 2018-01-01 criteria
2018-01-13 60.00 # alternative date == criteria date
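To make the arithmetic behind these numbers explicit, here is a small plain-Python check. It is only a hand-worked sketch: the pairing of each alternatives row with the criteria row in force on that date is typed in manually from the tables above, and the names `rows` and `ws_by_date` are made up for illustration.

```python
# Each entry: (alternatives date, (a1, a2, a3), criteria (c1, c2, c3) in force on that date).
rows = [
    ("2018-01-01", (10, 10, 10), (0.1, 1.0, 1.0)),  # criteria dated 2018-01-01
    ("2018-01-02", (20, 10, 10), (0.1, 1.0, 1.0)),  # carried forward from 2018-01-01
    ("2018-01-03", (20, 20, 20), (0.1, 1.0, 1.0)),  # carried forward from 2018-01-01
    ("2018-01-13", (10, 20, 30), (1.0, 1.0, 1.0)),  # criteria dated 2018-01-13
]

# Weighted sum per date: a1*c1 + a2*c2 + a3*c3.
ws_by_date = {
    date: sum(a * c for a, c in zip(alts, crit))
    for date, alts, crit in rows
}

for date, ws in ws_by_date.items():
    print(date, round(ws, 2))
```

Note that the criteria row dated 2018-01-05 is never used here, because no alternatives date falls between 2018-01-05 and 2018-01-13.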
Dear pandas wizards, please help.
Answer 0 (score: 1)
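Am I right that you want the result to include all dates from alternatives, and that a date which does not appear in alternatives should be dropped? If so, here is a solution: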
import pandas as pd

alternatives_dates = pd.DatetimeIndex(['20180101', '20180102', '20180103',
                                       '20180113'])
criteria_dates = pd.DatetimeIndex(['20180101', '20180105', '20180113'])
alternatives = pd.DataFrame(
    index=alternatives_dates, columns=['a1', 'a2', 'a3'],
    data=[[10, 10, 10], [20, 10, 10], [20, 20, 20], [10, 20, 30]]
)
criteria = pd.DataFrame(
    index=criteria_dates, columns=['c1', 'c2', 'c3'],
    data=[[0.1, 1, 1], [1, 0.5, 1], [1, 1, 1]]
)

# Outer-join on the date index so every date from either frame gets a row.
merged = alternatives.merge(criteria, how='outer', left_index=True, right_index=True)
# Forward-fill the criteria columns first, so that criteria-only dates
# (such as 2018-01-05) are carried forward before their rows are dropped.
criteria_cols = ['c1', 'c2', 'c3']
merged[criteria_cols] = merged[criteria_cols].ffill()
# Keep only the dates that are present in alternatives.
merged = merged.dropna(subset=['a1', 'a2', 'a3'])

result = (merged['a1'] * merged['c1']
          + merged['a2'] * merged['c2']
          + merged['a3'] * merged['c3'])
print(result)
# 2018-01-01    21.0
# 2018-01-02    22.0
# 2018-01-03    42.0
# 2018-01-13    60.0
# dtype: float64
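As a possible alternative to the merge, the criteria can be aligned to the alternatives dates directly with `DataFrame.reindex(..., method='ffill')`, which picks the last criteria row at or before each date. This is only a sketch under a few assumptions: both indexes are sorted in ascending order, the columns are in matching order (a1 with c1, a2 with c2, a3 with c3), and the names `aligned` and `ws` are made up for illustration.

```python
import pandas as pd

alternatives = pd.DataFrame(
    {"a1": [10, 20, 20, 10], "a2": [10, 10, 20, 20], "a3": [10, 10, 20, 30]},
    index=pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-13"]),
)
criteria = pd.DataFrame(
    {"c1": [0.1, 1.0, 1.0], "c2": [1.0, 0.5, 1.0], "c3": [1.0, 1.0, 1.0]},
    index=pd.to_datetime(["2018-01-01", "2018-01-05", "2018-01-13"]),
)

# For each alternatives date, take the last criteria row at or before it.
# Criteria-only dates (2018-01-05 here) never appear in the result.
aligned = criteria.reindex(alternatives.index, method="ffill")

# Multiply position-wise (a1*c1, a2*c2, a3*c3) and sum across each row.
ws = pd.Series(
    (alternatives.to_numpy() * aligned.to_numpy()).sum(axis=1),
    index=alternatives.index,
    name="ws",
)
print(ws)
```

With the sample data this gives 21, 22, 42 and 60 for the four alternatives dates. An alternatives date earlier than the first criteria date would get NaN weights, so that case would need its own handling.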