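Thanks in advance to anyone who can help me with this. What I want to do is update a zero-filled DataFrame with a datetime index (my trade DataFrame) using another DataFrame (indexed_orders) on the matching dates. My code is as follows: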
import pandas as pd
import numpy as np
import os
import csv
orders = pd.read_csv('./orders/orders.csv', parse_dates=True, sep=',', dayfirst=True) #initiate orders data frame from csv data file
indexed_orders = orders.set_index(['Date']) #set Date as index for orders
print(indexed_orders)
symbol_list = orders['Symbol'].tolist() #creates list of symbols
symbols = list(set(symbol_list)) #gets rid of duplicates in list
dates_list = orders['Date'].tolist() #creates list of order dates
dates_orders = list(set(dates_list)) #gets rid of duplicates in list
start_date = '2011-01-05' #establish date range
end_date = '2011-01-20'
dates = pd.date_range(start_date, end_date) #establish dates from start_date and end_date
trade = pd.DataFrame(0, index = dates, columns = symbols) #establish trade data frame
trade['Cash'] = 0 #add column for future calculations
print(trade)
This outputs the following for indexed_orders:
Date Symbol Order Shares
2011-01-10 AAPL BUY 1500
2011-01-13 AAPL SELL 1500
2011-01-13 IBM BUY 4000
2011-01-26 GOOG BUY 1000
2011-02-02 XOM SELL 4000
2011-02-10 XOM BUY 4000
2011-03-03 GOOG SELL 1000
2011-03-03 IBM SELL 2200
2011-06-03 IBM SELL 3300
2011-05-03 IBM BUY 1500
2011-06-10 AAPL BUY 1200
2011-08-01 GOOG BUY 55
2011-08-01 GOOG SELL 55
2011-12-20 AAPL SELL 1200
and the following for trade:
GOOG AAPL XOM IBM Cash
2011-01-05 0 0 0 0 0
2011-01-06 0 0 0 0 0
2011-01-07 0 0 0 0 0
2011-01-08 0 0 0 0 0
2011-01-09 0 0 0 0 0
2011-01-10 0 0 0 0 0
2011-01-11 0 0 0 0 0
2011-01-12 0 0 0 0 0
2011-01-13 0 0 0 0 0
2011-01-14 0 0 0 0 0
2011-01-15 0 0 0 0 0
2011-01-16 0 0 0 0 0
2011-01-17 0 0 0 0 0
2011-01-18 0 0 0 0 0
2011-01-19 0 0 0 0 0
2011-01-20 0 0 0 0 0
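I would like to update my trade DataFrame on the dates that appear in indexed_orders, inserting the number of 'Shares' into the column for the matching 'Symbol' (these are the AAPL, IBM, GOOG and XOM columns in trade). I also want 'Shares' to be negative when the 'Order' column in indexed_orders says 'SELL'. In other words, I am trying to come up with code that updates the trade DataFrame so that printing trade gives: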
GOOG AAPL XOM IBM Cash
2011-01-05 0 0 0 0 0
2011-01-06 0 0 0 0 0
2011-01-07 0 0 0 0 0
2011-01-08 0 0 0 0 0
2011-01-09 0 0 0 0 0
2011-01-10 0 1500 0 0 0
2011-01-11 0 0 0 0 0
2011-01-12 0 0 0 0 0
2011-01-13 0 -1500 0 4000 0
2011-01-14 0 0 0 0 0
2011-01-15 0 0 0 0 0
2011-01-16 0 0 0 0 0
2011-01-17 0 0 0 0 0
2011-01-18 0 0 0 0 0
2011-01-19 0 0 0 0 0
2011-01-20 0 0 0 0 0
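I was thinking of using some kind of iteration with nested boolean statements, but I am having a hard time coming up with one. In particular, I am struggling to find a way to go through the rows and update them based on the datetime of the index.
Any help would be greatly appreciated.
Answer 0 (score: 1)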
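First, you can use the Order column to sign the share changes. Then you can group by Date and Symbol and aggregate by summing the orders. This gives you the net order for every unique date and each Symbol traded on that day. Finally, use unstack to pivot the result into a tabular format with one column per Symbol.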
import numpy as np
import pandas as pd
df = pd.read_csv('temp.txt', sep='\t')  # tab-separated copy of the orders
print(df)
'''
Date Symbol Order Shares
0 1/10/11 AAPL BUY 1500
1 1/13/11 AAPL SELL 1500
2 1/13/11 IBM BUY 4000
3 1/26/11 GOOG BUY 1000
4 2/2/11 XOM SELL 4000
5 2/10/11 XOM BUY 4000
6 3/3/11 GOOG SELL 1000
7 3/3/11 IBM SELL 2200
8 6/3/11 IBM SELL 3300
9 5/3/11 IBM BUY 1500
10 6/10/11 AAPL BUY 1200
11 8/1/11 GOOG BUY 55
12 8/1/11 GOOG SELL 55
13 12/20/11 AAPL SELL 1200
'''
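# Sign the shares: +1 for BUY, -1 for SELL
df['SharesChange'] = df.Shares * df.Order.apply(lambda o: 1 if o == 'BUY' else -1)
# Sum the signed shares per (Date, Symbol), then pivot Symbol into columns
df = df.groupby(['Date', 'Symbol']).agg({'SharesChange': 'sum'}).unstack().fillna(0)
print(df)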
'''
SharesChange
Symbol AAPL GOOG IBM XOM
Date
1/10/11 1500 0 0 0
1/13/11 -1500 0 4000 0
1/26/11 0 1000 0 0
12/20/11 -1200 0 0 0
2/10/11 0 0 0 4000
2/2/11 0 0 0 -4000
3/3/11 0 -1000 -2200 0
5/3/11 0 0 1500 0
6/10/11 1200 0 0 0
6/3/11 0 0 -3300 0
8/1/11 0 0 0 0
'''
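The table above still contains only the dates that have orders, and because Date was left as a string the rows sort lexically rather than chronologically. To get from there to the zero-filled daily trade frame in the question, one option is to parse the dates and reindex onto the date range. The following is a minimal sketch, not the answerer's code: it assumes the CSV stores ISO-formatted dates as shown in the question's output, and it uses a small inline sample in place of the real orders file.

```python
import io
import pandas as pd

# Inline sample standing in for orders.csv (assumed layout and date format).
csv_text = """Date,Symbol,Order,Shares
2011-01-10,AAPL,BUY,1500
2011-01-13,AAPL,SELL,1500
2011-01-13,IBM,BUY,4000
2011-01-26,GOOG,BUY,1000
2011-02-02,XOM,SELL,4000
"""
orders = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Sign the shares: positive for BUY, negative for SELL.
sign = orders["Order"].map({"BUY": 1, "SELL": -1})
orders["SharesChange"] = orders["Shares"] * sign

# Net change per (Date, Symbol), pivoted so that each symbol is a column.
net = (
    orders.groupby(["Date", "Symbol"])["SharesChange"]
    .sum()
    .unstack(fill_value=0)
)

# Align to the daily calendar: days without orders become 0,
# and orders outside the range are dropped.
dates = pd.date_range("2011-01-05", "2011-01-20")
trade = net.reindex(dates, fill_value=0)
trade["Cash"] = 0  # placeholder column for later calculations
print(trade)
```

Selecting the SharesChange column before calling sum keeps the result a plain Series, so unstack produces single-level column names (AAPL, GOOG, IBM, XOM) rather than a MultiIndex headed by SharesChange.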