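I am trying to build a time series of the market value of my portfolio. The whole site is built on the Django framework, so the data set will be dynamic.
I have a data frame named dataset that contains the stocks' closing prices: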
YAR.OL NHY.OL
date
2000-01-03 NaN 18.550200
2000-01-04 NaN 18.254101
2000-01-05 NaN 17.877100
2000-01-06 NaN 18.523300
2000-01-07 NaN 18.819500
... ... ...
2020-07-27 381.799988 26.350000
2020-07-28 382.399994 26.490000
2020-07-29 377.899994 26.389999
2020-07-30 372.000000 25.049999
2020-07-31 380.700012 25.420000
I have a data frame named positions that contains the positions in the user's portfolio:
Date Direction Ticker Price ... FX-rate Comission Short Cost-price
0 2020-07-27 Buy YAR.OL 381.0 ... 1.0 0.0 False 381.0
1 2020-07-31 Sell YAR.OL 380.0 ... 1.0 0.0 False -380.0
2 2020-07-28 Buy NHY.OL 26.5 ... 1.0 0.0 False 26.5
Code for the positions data frame:
data = zip(date_list, direction_list ,ticker_list,price_list,new_volume_list,exchange_list,commision_list,short_list, cost_price_list)
df = pd.DataFrame(data,columns=['Date','Direction','Ticker','Price','Volume','FX-rate','Comission','Short','Cost-price'])
In addition, I have managed to split the positions data frame by ticker:
dataset = self.dataset_creator(n_ticker_list)
dataset.index = pd.to_datetime(dataset.index)
positions = self.get_all_positions(selected_portfolio)
for ticker in n_ticker_list:
    s = positions.loc[positions['Ticker'] == ticker]
    s = s.sort_values(by='Date')
    print(s)
This gives me:
Date Direction Ticker Price ... FX-rate Comission Short Cost-price
0 2020-07-27 Buy YAR.OL 381.0 ... 1.0 0.0 False 381.0
1 2020-07-31 Sell YAR.OL 380.0 ... 1.0 0.0 False -380.0
[2 rows x 9 columns]
Date Direction Ticker Price ... FX-rate Comission Short Cost-price
2 2020-07-28 Buy NHY.OL 26.5 ... 1.0 0.0 False 26.5
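I have already done this in Excel; the end goal is to create the yellow data frame:
Note that this is dynamic. I used two stocks and a short time frame to keep the example simple, but there could just as easily be ten stocks.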
Answer 0 (score: 1)
Overview / summary
from io import StringIO
import pandas as pd
# create data frame with closing prices
data = '''date YAR.OL NHY.OL
2020-07-27 381.799988 26.350000
2020-07-28 382.399994 26.490000
2020-07-29 377.899994 26.389999
2020-07-30 372.000000 25.049999
2020-07-31 380.700012 25.420000
'''
closing_prices = (pd.read_csv(StringIO(data),
                              sep=r'\s+', engine='python',  # raw string avoids an invalid-escape warning
                              parse_dates=['date'])
                  .set_index('date')
                  .sort_index()
                  .sort_index(axis=1)
                  )
print(closing_prices.round(2))
NHY.OL YAR.OL
date
2020-07-27 26.35 381.8
2020-07-28 26.49 382.4
2020-07-29 26.39 377.9
2020-07-30 25.05 372.0
2020-07-31 25.42 380.7
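Now create the positions (typed in from the Excel screenshot). I assumed that each entry is a buy or a sell on that day. The cumulative sum then gives the position held at each point in time.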
positions = [
('YAR.OL', '2020-07-27', 1),
('YAR.OL', '2020-07-31', -1),
('NHY.OL', '2020-07-28', 1),
]
# changed cost_price to volume
positions = pd.DataFrame(positions, columns=['tickers', 'date', 'volume'])
positions['date'] = pd.to_datetime(positions['date'])
positions = (positions.pivot(index='date', columns='tickers', values='volume')
             .sort_index()
             .sort_index(axis=1)
             )
positions = positions.reindex(closing_prices.index).fillna(0).cumsum()
print(positions)
tickers NHY.OL YAR.OL
date
2020-07-27 0.0 1.0 # <-- cumulative positions (running sum of transaction volumes)
2020-07-28 1.0 1.0
2020-07-29 1.0 1.0
2020-07-30 1.0 1.0
2020-07-31 1.0 0.0
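The trades above were typed in by hand with signed volumes. As a rough sketch, the same table could probably be derived from a frame shaped like the question's positions data (Date, Direction, Ticker, Volume). This assumes that a 'Buy' adds to the holding and a 'Sell' reduces it, and it ignores the Short column. The Volume values and the names trades, signed_volume, calendar and holdings are made up for illustration. It uses pivot_table with aggfunc='sum' rather than pivot, so that several trades in the same ticker on the same day are added together instead of raising an error.

```python
import pandas as pd

# Hypothetical trades in the shape of the question's `positions` frame
# (only the columns needed here; the Volume values are invented).
trades = pd.DataFrame({
    'Date': pd.to_datetime(['2020-07-27', '2020-07-27', '2020-07-31', '2020-07-28']),
    'Direction': ['Buy', 'Buy', 'Sell', 'Buy'],
    'Ticker': ['YAR.OL', 'YAR.OL', 'YAR.OL', 'NHY.OL'],
    'Volume': [1, 2, 3, 1],
})

# Assumption: 'Buy' increases the holding, 'Sell' decreases it.
sign = trades['Direction'].map({'Buy': 1, 'Sell': -1})
trades['signed_volume'] = trades['Volume'] * sign

# pivot_table sums duplicate (date, ticker) pairs; pivot would raise on them.
daily = trades.pivot_table(index='Date', columns='Ticker',
                           values='signed_volume', aggfunc='sum')

# Align to a calendar (in practice: closing_prices.index), treat days
# without trades as 0, then accumulate to get the position on each day.
calendar = pd.date_range('2020-07-27', '2020-07-31', freq='D')
holdings = daily.reindex(calendar).fillna(0).cumsum()
print(holdings)
```

In the real application the calendar would be the index of the closing-price frame, as in the reindex call above.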
Now the portfolio value is the position multiplied by the closing price. There is one column per stock, and "sum(axis=1)" gives the total for each day.
port_value = positions * closing_prices
port_value['total'] = port_value.sum(axis=1)
print(port_value.round(2))
tickers NHY.OL YAR.OL total
date
2020-07-27 0.00 381.8 381.80
2020-07-28 26.49 382.4 408.89
2020-07-29 26.39 377.9 404.29
2020-07-30 25.05 372.0 397.05
2020-07-31 25.42 0.0 25.42
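Since the question mentions that the number of tickers is dynamic, the multiplication step could be wrapped in a small helper along these lines. This is only a sketch: the function name portfolio_value and its arguments are hypothetical, the prices are the rounded values from the example above, and the holdings are given only on the days they change. Positions are carried forward to every price date, and a ticker without holdings counts as 0. If a price is missing (NaN, as for YAR.OL in the early years of the question's data), that ticker contributes nothing to the total, because sum skips NaN by default.

```python
import pandas as pd

def portfolio_value(closing_prices, holdings):
    """Daily market value per ticker plus a 'total' column.

    closing_prices: DataFrame indexed by date, one column per ticker.
    holdings: DataFrame of cumulative positions, indexed by date;
              it may contain only the dates on which a position changed.
    """
    aligned = (holdings.sort_index()
               # take the most recent known position for every price date
               .reindex(closing_prices.index, method='ffill')
               # tickers that were never traded get a position of 0
               .reindex(columns=closing_prices.columns, fill_value=0)
               # dates before the first trade have no position yet
               .fillna(0))
    value = aligned * closing_prices
    value['total'] = value.sum(axis=1)  # NaN prices are skipped here
    return value

dates = pd.to_datetime(['2020-07-27', '2020-07-28', '2020-07-29',
                        '2020-07-30', '2020-07-31'])
prices = pd.DataFrame({'NHY.OL': [26.35, 26.49, 26.39, 25.05, 25.42],
                       'YAR.OL': [381.8, 382.4, 377.9, 372.0, 380.7]},
                      index=dates)

# Cumulative positions, listed only on the days they change.
held = pd.DataFrame({'NHY.OL': [0, 1, 1], 'YAR.OL': [1, 1, 0]},
                    index=pd.to_datetime(['2020-07-27', '2020-07-28',
                                          '2020-07-31']))

result = portfolio_value(prices, held)
print(result.round(2))
```

The totals should agree with the table above, since the inputs are the same apart from rounding.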
Update: suggestions for further work