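Group values in a pandas DataFrame by time range in Python

Time: 2020-03-02 01:02:23

Tags: python pandas

I've been trying to solve this on my own and I just can't figure it out. I have trades that are each made up of several transactions, and I'm trying to group those transactions together in 10-second intervals.

This is where I am. I believe I've laid the groundwork, but my latest attempt (the last line) returns "AttributeError: 'Int64Index' object has no attribute 'to_period'".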

import pandas as pd
from datetime import timedelta

"""
read csv file
clean date column
convert date str to datetime
sort for equity options
replace date str column with datetime column
"""
trade_reader = pd.read_csv('TastyTrades.csv')
trade_reader['Date'] = trade_reader['Date'].replace({'T': ' ', '-0500': ''}, regex=True)
date_converter = pd.to_datetime(trade_reader['Date'], format="%Y-%m-%d %H:%M:%S")
options_frame = trade_reader.loc[(trade_reader['Instrument Type'] == 'Equity Option')]
clean_frame = options_frame.replace(to_replace=['Date'], value='date_converter')

# Separate opening transaction from closing transactions, combine frames
opens = clean_frame[clean_frame['Action'].isin(['BUY_TO_OPEN', 'SELL_TO_OPEN'])]
closes = clean_frame[clean_frame['Action'].isin(['BUY_TO_CLOSE', 'SELL_TO_CLOSE'])]
open_close_set = set(opens['Symbol']) & set(closes['Symbol'])
open_close_frame = clean_frame[clean_frame['Symbol'].isin(open_close_set)]

# convert Value to float, sort, write
ocf_float = open_close_frame['Value'].astype(float)
ocf_sorted = open_close_frame.sort_values(by=['Symbol', 'Call or Put', 'Date'], ascending=True)
ocf_sorted.to_csv('Sorted.csv')

BTO_frame = opens[opens['Action'].isin(['BUY_TO_OPEN'])]
STO_frame = opens[opens['Action'].isin(['SELL_TO_OPEN'])]
debit_single = []
vertical = []
iron_condor = []
delta = timedelta(seconds=10)

temp_list = BTO_frame.groupby(BTO_frame['Date'].index.to_period(second=10))

A sample of the data I'm working with:

361,2020-01-15 15:27:18,Trade,BUY_TO_OPEN,QQQ   200221P00218000,Equity Option,Bought 1 QQQ 02/21/20 Put 218.00 @ 3.44,-344.00,1.0,-344.00,-1.0,-0.14,100.0,QQQ,2/21/20,218.0,PUT
356,2020-01-17 10:10:27,Trade,SELL_TO_CLOSE,QQQ   200221P00218000,Equity Option,Sold 1 QQQ 02/21/20 Put 218.00 @ 2.26,226.00,1.0,226.00,0.0,-0.15,100.0,QQQ,2/21/20,218.0,PUT
360,2020-01-15 15:27:18,Trade,SELL_TO_OPEN,QQQ   200221P00219000,Equity Option,Sold 1 QQQ 02/21/20 Put 219.00 @ 3.77,377.00,1.0,377.00,-1.0,-0.15,100.0,QQQ,2/21/20,219.0,PUT
357,2020-01-17 10:10:27,Trade,BUY_TO_CLOSE,QQQ   200221P00219000,Equity Option,Bought 1 QQQ 02/21/20 Put 219.00 @ 2.49,-249.00,1.0,-249.00,0.0,-0.14,100.0,QQQ,2/21/20,219.0,PUT
347,2020-01-24 12:28:19,Trade,BUY_TO_OPEN,QQQ   200221P00223000,Equity Option,Bought 1 QQQ 02/21/20 Put 223.00 @ 3.95,-395.00,1.0,-395.00,-1.0,-0.14,100.0,QQQ,2/21/20,223.0,PUT
299,2020-01-30 16:02:56,Trade,SELL_TO_CLOSE,QQQ   200221P00223000,Equity Option,Sold 1 QQQ 02/21/20 Put 223.00 @ 2.91,291.00,1.0,291.00,0.0,-0.15,100.0,QQQ,2/21/20,223.0,PUT
346,2020-01-24 12:28:19,Trade,SELL_TO_OPEN,QQQ   200221P00224000,Equity Option,Sold 1 QQQ 02/21/20 Put 224.00 @ 4.34,434.00,1.0,434.00,-1.0,-0.15,100.0,QQQ,2/21/20,224.0,PUT
300,2020-01-30 16:02:55,Trade,BUY_TO_CLOSE,QQQ   200221P00224000,Equity Option,Bought 1 QQQ 02/21/20 Put 224.00 @ 3.26,-326.00,1.0,-326.00,0.0,-0.14,100.0,QQQ,2/21/20,224.0,PUT
339,2020-01-27 09:56:51,Trade,SELL_TO_OPEN,QQQ   200320C00219000,Equity Option,Sold 1 QQQ 03/20/20 Call 219.00 @ 6.24,624.00,1.0,624.00,-1.0,-0.16,100.0,QQQ,3/20/20,219.0,CALL
15,2020-02-27 15:59:01,Trade,BUY_TO_CLOSE,QQQ   200320C00219000,Equity Option,Bought 1 QQQ 03/20/20 Call 219.00 @ 2.31,-231.00,1.0,-231.00,0.0,-0.14,100.0,QQQ,3/20/20,219.0,CALL
340,2020-01-27 09:56:51,Trade,BUY_TO_OPEN,QQQ   200320C00220000,Equity Option,Bought 1 QQQ 03/20/20 Call 220.00 @ 5.66,-566.00,1.0,-566.00,-1.0,-0.14,100.0,QQQ,3/20/20,220.0,CALL
14,2020-02-27 15:59:01,Trade,SELL_TO_CLOSE,QQQ   200320C00220000,Equity Option,Sold 1 QQQ 03/20/20 Call 220.00 @ 2.01,201.00,1.0,201.00,0.0,-0.15,100.0,QQQ,3/20/20,220.0,CALL

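The end result should treat these 12 transactions as 3 trades, grouping them within a 10-second window based on the date of the opening side for each underlying symbol.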

Edit:

A sample of the original dataset:

Date,Type,Action,Symbol,Instrument Type,Description,Value,Quantity,Average Price,Commissions,Fees,Multiplier,Underlying Symbol,Expiration Date,Strike Price,Call or Put
2020-02-29T10:09:05-0500,Money Movement,,,,Regulatory fee adjustment,-0.28,0.0,,,0.00,,,,,
2020-02-28T16:00:00-0500,Receive Deliver,,M     200228C00019500,Equity Option,Removal of 3 M 02/28/20 Call 19.50 due to expiration.,0.00,3.0,0.00,,0.00,100,M,2/28/20,19.5,CALL
2020-02-28T15:36:34-0500,Trade,BUY_TO_OPEN,SVXY  200619C00085000,Equity Option,Bought 1 SVXY 06/19/20 Call 85.00 @ 0.06,-6.00,1.0,-6.00,-1.00,-0.14,100,SVXY,6/19/20,85.0,CALL
2020-02-28T15:33:32-0500,Trade,BUY_TO_OPEN,SVXY  200320C00069000,Equity Option,Bought 1 SVXY 03/20/20 Call 69.00 @ 0.15,-15.00,1.0,-15.00,-1.00,-0.14,100,SVXY,3/20/20,69.0,CALL
2020-02-28T12:06:13-0500,Trade,BUY_TO_OPEN,GME   200417C00010000,Equity Option,Bought 10 GME 04/17/20 Call 10.00 @ 0.01,-10.00,10.0,-1.00,-10.00,-1.39,100,GME,4/17/20,10.0,CALL
2020-02-28T12:05:54-0500,Trade,BUY_TO_OPEN,GME   200417C00004500,Equity Option,Bought 1 GME 04/17/20 Call 4.50 @ 0.23,-23.00,1.0,-23.00,-1.00,-0.14,100,GME,4/17/20,4.5,CALL
2020-02-28T10:23:57-0500,Trade,SELL_TO_OPEN,VXX   200417C00025000,Equity Option,Sold 1 VXX 04/17/20 Call 25.00 @ 3.39,339.00,1.0,339.00,-1.00,-0.15,100,VXX,4/17/20,25.0,CALL
2020-02-28T10:23:57-0500,Trade,BUY_TO_OPEN,VXX   200417C00026000,Equity Option,Bought 1 VXX 04/17/20 Call 26.00 @ 3.02,-302.00,1.0,-302.00,-1.00,-0.14,100,VXX,4/17/20,26.0,CALL

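1 Answer:

Answer 0: (score: 0)

Please try this and let me know whether I understood you correctly. I saved your data as a csv and loaded it into a dataframe named opens. The new dataframe has the following columns.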

['Date', 'Type', 'Action', 'Symbol', 'Instrument Type', 'Description',
       'Value', 'Quantity', 'Average Price', 'Commissions', 'Fees',
       'Multiplier', 'Underlying Symbol', 'Expiration Date', 'Strike Price',
       'Call or Put']

I converted the 'Date' column to datetime

opens['Date'] = pd.to_datetime(opens['Date'])
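
As a side note, pd.to_datetime can usually parse the raw `2020-02-28T15:36:34-0500` strings directly, so stripping `T` and `-0500` with a regex should not be necessary. A minimal sketch, using two timestamps copied from the raw sample; the variable names are illustrative only:

```python
import pandas as pd

# Two raw timestamps copied from the original dataset sample.
raw = pd.Series(["2020-02-28T15:36:34-0500", "2020-02-28T15:33:32-0500"])

# Parsing keeps the UTC offset, so the result is timezone-aware.
parsed = pd.to_datetime(raw)

# Optional: drop the offset to get naive timestamps in local wall time.
naive = parsed.dt.tz_localize(None)
print(naive)
```

If the file spans a daylight-saving change, the offsets will differ between rows (-0400 and -0500), and in that case passing `utc=True` to pd.to_datetime is probably the safer route.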

I set the 'Date' column as the dataframe index

opens.set_index('Date', inplace=True)
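
As far as I can tell, this step is also what the attempt in the question was missing: `BTO_frame['Date'].index` is the row-label index of the column (plain integers), not the dates themselves, and `to_period` only exists on datetime-like indexes. A small sketch with two made-up rows to show the difference; `demo`, `indexed`, and `floored` are illustrative names:

```python
import pandas as pd

demo = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-15 15:27:18", "2020-01-24 12:28:19"]),
    "Action": ["BUY_TO_OPEN", "BUY_TO_OPEN"],
})

# The index of a column is the row labels, not the column's values,
# so it is an integer-based index without a to_period method.
print(type(demo["Date"].index).__name__)
print(hasattr(demo["Date"].index, "to_period"))

# After set_index, the index holds the timestamps themselves.
indexed = demo.set_index("Date")
print(isinstance(indexed.index, pd.DatetimeIndex))

# Without touching the index, the .dt accessor works on the column directly:
# flooring to 10 seconds gives a key that can be passed to groupby.
floored = demo["Date"].dt.floor("10s")
print(demo.groupby(floored).size())
```

Grouping on the floored column is probably the closest working equivalent of the last line in the question.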

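I group the dataframe by Action and Symbol while binning the index into 10-second intervals. At the same time, I count the Type entries of each group within each interval

opens.groupby(["Action", "Symbol"]).resample("10S").apply(lambda x: x['Type'].count())

Or is this what you want?

opens.groupby(["Action", "Symbol"]).resample("10S").apply(lambda x: x['Type'].count()).unstack()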
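
One caveat with resample: the 10-second bins are fixed to the clock, so fills at 16:02:59 and 16:03:01 would land in different bins even though they are only 2 seconds apart. If the goal is "legs opened within 10 seconds of each other form one trade", an alternative is to start a new group whenever the gap to the previous opening leg of the same underlying exceeds 10 seconds. A rough sketch under that assumption, using the six opening legs from the sample in the question; the `trade_id` column and the helper variable names are my own, not something from the question:

```python
import pandas as pd

# The six opening legs from the question's sample (dates, actions, strikes).
opens = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-01-15 15:27:18", "2020-01-15 15:27:18",
        "2020-01-24 12:28:19", "2020-01-24 12:28:19",
        "2020-01-27 09:56:51", "2020-01-27 09:56:51",
    ]),
    "Action": ["BUY_TO_OPEN", "SELL_TO_OPEN", "BUY_TO_OPEN",
               "SELL_TO_OPEN", "SELL_TO_OPEN", "BUY_TO_OPEN"],
    "Underlying Symbol": ["QQQ"] * 6,
    "Strike Price": [218.0, 219.0, 223.0, 224.0, 219.0, 220.0],
})

opens = opens.sort_values(["Underlying Symbol", "Date"])

# Time elapsed since the previous opening leg of the same underlying.
gap = opens.groupby("Underlying Symbol")["Date"].diff()

# A leg starts a new trade when the gap exceeds the window. The first leg
# of each underlying has no previous leg (NaT), so it starts a trade too.
new_trade = gap.isna() | (gap > pd.Timedelta(seconds=10))
opens["trade_id"] = new_trade.cumsum()

legs_per_trade = opens.groupby("trade_id").size()
print(legs_per_trade)
```

Note that this chains gaps: legs arriving every 8 seconds would all end up in the same trade, which may or may not be what you want. Grouping by Underlying Symbol rather than by Action and Symbol keeps the buy and sell legs of a spread together.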

What does pushing to vertical mean? From your code, vertical is a dictionary. What are its keys, and what should its values be?