Given a first DataFrame
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
date_today = datetime.now()
days = pd.date_range(date_today, date_today + timedelta(1), freq='D')
symbols = ['A','B']
np.random.seed(seed=1111)
dataA = np.random.randint(1, high=100, size=len(days))
dataB = np.random.randint(1, high=100, size=len(days))
df1 = pd.DataFrame({symbols[0]: dataA,symbols[1] :dataB },index=days)
print(df1)
                             A   B
2019-05-20 06:52:21.013198  29  82
2019-05-21 06:52:21.013198  56  13
and a second DataFrame
df2 = pd.DataFrame({'const1': [1,2],'const2' : [2,3] },index=['A','B'])
print(df2)
   const1  const2
A       1       2
B       2       3
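I want to compute a third DataFrame with the same structure as the first, where each cell is calculated by combining values from the first two DataFrames.
The following code computes each cell of the third DataFrame correctly: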
df3 = df1.copy()  # copy, so that df1 itself is not overwritten
for symbol in symbols:
    const1 = df2.at[symbol, 'const1']
    const2 = df2.at[symbol, 'const2']
    for index, row in df1.iterrows():
        value = df1.at[index, symbol]
        df3.at[index, symbol] = const1*value + const2*value
print(df3)
                              A    B
2019-05-20 06:58:52.753879   87  410
2019-05-21 06:58:52.753879  168   65
How can I get rid of the ugly loops and do the calculation more efficiently?
Answer 0 (score: 0)
Try:
df3 = df1 * df2.sum(axis=1)
Now:
print(df3)
gives:
                              A    B
2019-05-20 06:58:52.753879   87  410
2019-05-21 06:58:52.753879  168   65
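This works because const1*value + const2*value factors into (const1 + const2)*value, so df2.sum(axis=1) yields one multiplier per symbol, and multiplying a DataFrame by a Series matches the Series index against the DataFrame's column labels. A minimal sketch that checks the one-liner against the explicit loop; it assumes fixed dates and hard-codes the values printed in the question instead of drawing random data, and the names factors, vectorized and looped are illustrative only:

```python
import pandas as pd

# Fixed values matching the df1 printed in the question; the dates are
# an assumption made so the example is reproducible.
days = pd.to_datetime(['2019-05-20', '2019-05-21'])
df1 = pd.DataFrame({'A': [29, 56], 'B': [82, 13]}, index=days)
df2 = pd.DataFrame({'const1': [1, 2], 'const2': [2, 3]}, index=['A', 'B'])

# One multiplier per symbol: const1 + const2.
factors = df2.sum(axis=1)

# The Series index (A, B) is aligned with df1's column labels.
vectorized = df1 * factors

# Reference result from the explicit loop, written into a copy of df1.
looped = df1.copy()
for symbol in df1.columns:
    c1 = df2.at[symbol, 'const1']
    c2 = df2.at[symbol, 'const2']
    for idx in df1.index:
        value = df1.at[idx, symbol]
        looped.at[idx, symbol] = c1 * value + c2 * value

print(vectorized.equals(looped))
```

If the real calculation does not factor this way (for example const1*value + const2), the same column alignment should still apply, e.g. df1 * df2['const1'] + df2['const2'].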
Answer 1 (score: 0)
If some symbols might not match between the two DataFrames:
date_today = datetime.now()
days = pd.date_range(date_today, date_today + timedelta(1), freq='D')
symbols = ['A','C']
np.random.seed(seed=1111)
dataA = np.random.randint(1, high=100, size=len(days))
dataB = np.random.randint(1, high=100, size=len(days))
df1 = pd.DataFrame({symbols[0]: dataA,symbols[1] :dataB },index=days)
print(df1)
                             A   C
2019-05-20 09:24:33.383637  29  82
2019-05-21 09:24:33.383637  56  13
df2 = pd.DataFrame({'const1': [1,2],'const2' : [2,3] },index=['A','B'])
print(df2)
   const1  const2
A       1       2
B       2       3
df3 = df1.mul(df2.sum(axis=1).reindex(df1.columns, fill_value=1))
print(df3)
                              A   C
2019-05-20 09:25:48.075084   87  82
2019-05-21 09:25:48.075084  168  13
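The reindex is needed because passing fill_value directly to mul together with a Series fails:
df3 = df1.mul(df2.sum(axis=1), fill_value=1)
NotImplementedError: fill_value 1 not supported.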
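To see what the reindex step protects against, here is a small sketch comparing plain multiplication with the reindexed version when the symbols only partly overlap. It assumes fixed dates and the values printed above; the names factors, naive, aligned and result are illustrative only:

```python
import pandas as pd

# df1 has symbols A and C, while df2 only knows A and B.
days = pd.to_datetime(['2019-05-20', '2019-05-21'])
df1 = pd.DataFrame({'A': [29, 56], 'C': [82, 13]}, index=days)
df2 = pd.DataFrame({'const1': [1, 2], 'const2': [2, 3]}, index=['A', 'B'])

factors = df2.sum(axis=1)

# Plain multiplication aligns on the union of labels: the result gains a
# column B, and both B and C are NaN because one side has no value there.
naive = df1 * factors
print(naive)

# Reindexing to df1's columns drops B and gives C a neutral multiplier of 1,
# so unmatched symbols keep their original values.
aligned = factors.reindex(df1.columns, fill_value=1)
result = df1.mul(aligned)
print(result)
```

Using 1 as the fill value keeps unmatched columns unchanged; whether that is the right default depends on the data, and a different fill value (or leaving NaN) may be more appropriate in other cases.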