I have a DataFrame (named table) with 6 columns labeled [price1, price2, price3, time, type, volume].
For type, the values are 'Q' and 'T', arranged as follows:
Q
T
Q
T
T
Q
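Now I want to combine the rows with consecutive T's and add up their volume values. Consecutive T's have the same price and time values.
That is, I want
Price ...: Time: Type: Volume: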
10000 2012.05 Q 10
10000 2012.05 T 20
10000 2012.05 Q 10
10000 2012.06 T 20
10000 2012.06 T 30
10000 2012.07 Q 10
to become:
10000 2012.05 Q 10
10000 2012.05 T 20
10000 2012.05 Q 10
10000 2012.06 T 20 + 30 = 50
10000 2012.07 Q 10
Here is my code, but it does not return the desired result. Could someone help me find my error?
def combine(df):
    combined = [] # Init empty list
    length = len(df.iloc[:,0]) # Get the number of rows in DataFrame
    i = 0
    while i < length:
        num_elements = num_elements_equal(df, i, 0, 'T') # Get the number of consecutive 'T's
        if num_elements <= 1: # If there are 1 or less T's, append only that element to combined, with the same type
            combined.append([df.iloc[i,0],df.iloc[i,1],df.iloc[i,2],df.iloc[i,3],df.iloc[i,4],df.iloc[i,5]])
        else: # Otherwise, append the sum of all the elements to combined, with 'T' type
            combined.append(['T', sum_elements(df, i, i+num_elements, 5)])
        i += max(num_elements, 1) # Increment i by the number of elements combined, with a min increment of 1
    return pd.DataFrame(combined, columns=df.columns) # Return as DataFrame

def num_elements_equal(df, start, column, value): # Counts the number of consecutive elements
    i = start
    num = 0
    while i < len(df.iloc[:,column]):
        if df.iloc[i,column] == value:
            num += 1
            i += 1
        else:
            return num
    return num

def sum_elements(df, start, end, column): # Sums the elements from start to end
    return sum(df.iloc[start:end, column])

tableT = combine(table)
tableT
Answer 0 (score: 1)
IIUC:
Input DataFrame, df:
Price Time Type Volume
0 10000 2012.05 Q 10
1 10000 2012.05 T 20
2 10000 2012.05 Q 10
3 10000 2012.06 T 20
4 10000 2012.06 T 30
5 10000 2012.07 Q 10
Combine the T records and sum the volume:
df.groupby(by=[df.Type.ne('T').cumsum(),'Price','Time','Type'], as_index=False)['Volume'].sum()
Output:
Price Time Type Volume
0 10000 2012.05 Q 10
1 10000 2012.05 T 20
2 10000 2012.05 Q 10
3 10000 2012.06 T 50
4 10000 2012.07 Q 10
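As a rough sketch of why that grouping key works: Type.ne('T').cumsum() goes up by one at every non-T row, so each Q row starts a new label and the T rows that follow it inherit that label. Grouping on the label together with Price, Time and Type therefore keeps every Q row separate and merges only adjacent T rows. The variant below rebuilds the sample frame so it can be run on its own. The name block for the helper key is an illustrative choice (not from the one-liner above) that avoids any clash with the existing Type column, and sort=False is added on the assumption that the original row order should be preserved.

```python
import pandas as pd

# Sample data copied from the input shown above.
df = pd.DataFrame({
    "Price": [10000] * 6,
    "Time": [2012.05, 2012.05, 2012.05, 2012.06, 2012.06, 2012.07],
    "Type": ["Q", "T", "Q", "T", "T", "Q"],
    "Volume": [10, 20, 10, 20, 30, 10],
})

# Helper key: increases by one at every non-T row, so a run of
# consecutive T rows shares the label of the row that precedes it.
# "block" is a hypothetical name used only for this sketch.
block = df["Type"].ne("T").cumsum().rename("block")

result = (
    df.groupby([block, "Price", "Time", "Type"], sort=False)["Volume"]
      .sum()                  # add up Volume inside each group
      .reset_index()          # turn the group keys back into columns
      .drop(columns="block")  # the helper key is no longer needed
)

print(block.tolist())
print(result)
```

As for the loop in the question, one likely cause of the unchanged result is that num_elements_equal(df, i, 0, 'T') inspects column 0 (price1) instead of the type column at position 4, so no run of T's is ever found. The else branch would also append only two values where the frame has six columns.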