Say I have the following DataFrame.
import pandas as pd
df = pd.DataFrame()
df['close'] = (7980,7996,8855,8363,8283,8303,8266,8582,8586,8179,8206,7854,8145,8152,8240,8373,8319,8298,8048,8218,8188,8055,8432,8537,9682,10021,9985,10169,10272,10152,10196,10270,10306,10355,10969,10420,10154,10096,10307,10400,10484)
df['A'] = ('TDOWN','TDOWN', 'TDOWN', 'TOP', 'TOP', 'TOP', 'TOP', 'TOP','BUP','BUP','BUP', 'BUP', 'BUP', 'BOTTOM', 'BOTTOM', 'BOTTOM', 'BUP','BUP','BUP','BUP', 'BOTTOM', 'BOTTOM', 'BUP','BUP','BUP', 'BUP','BUP','BUP','BUP', 'BOTTOM', 'BOTTOM', 'BOTTOM', 'BOTTOM','TDOWN','TDOWN', 'TDOWN', 'TOP', 'TOP', 'TOP', 'TOP', 'TOP')
print(df)
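For each group of 'TOP' and 'BOTTOM' rows, I want to return the highest number in a 'TOP' group and the lowest number in a 'BOTTOM' group. Below is the ideal result I want to achieve: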
df['outcome1'] = ('-','-', '-', '-', '-', '-', '-', '8582','-','-','-', '-', '-', '8152', '-', '-', '-','-','-','-', '-', '8055', '-','-','-', '-','-','-','-', '10152', '-', '-', '-','-','-', '-', '-', '-', '-', '-', '10484')
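You will notice that the 'outcome1' column shows a number only in some rows, matched against column A. Those numbers are the highest value within each 'TOP' group and the lowest value within each 'BOTTOM' group.
How can I code this so that I get the same result as the 'outcome1' column?
Thanks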
Answer 0: (score: 3)
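We can achieve this with the following steps:
1. Group the consecutive rows of TOP and BOTTOM
2. Get the max and min of each group
3. Combine the max and min values with fillna
grps = (~df['A'].isin(['TOP', 'BOTTOM'])).cumsum()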
top = df.where(df['A'].eq('TOP')).groupby(grps)['close'].transform('max')
bottom = df.where(df['A'].eq('BOTTOM')).groupby(grps)['close'].transform('min')
values = top.fillna(bottom)
df['outcome1'] = values.where(values.eq(df['close']), '-')
Output
close A outcome1
0 7980 TDOWN -
1 7996 TDOWN -
2 8855 TDOWN -
3 8363 TOP -
4 8283 TOP -
5 8303 TOP -
6 8266 TOP -
7 8582 TOP 8582
8 8586 BUP -
9 8179 BUP -
10 8206 BUP -
11 7854 BUP -
12 8145 BUP -
13 8152 BOTTOM 8152
14 8240 BOTTOM -
15 8373 BOTTOM -
16 8319 BUP -
17 8298 BUP -
18 8048 BUP -
19 8218 BUP -
20 8188 BOTTOM -
21 8055 BOTTOM 8055
22 8432 BUP -
23 8537 BUP -
24 9682 BUP -
25 10021 BUP -
26 9985 BUP -
27 10169 BUP -
28 10272 BUP -
29 10152 BOTTOM 10152
30 10196 BOTTOM -
31 10270 BOTTOM -
32 10306 BOTTOM -
33 10355 TDOWN -
34 10969 TDOWN -
35 10420 TDOWN -
36 10154 TOP -
37 10096 TOP -
38 10307 TOP -
39 10400 TOP -
40 10484 TOP 10484
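To see what each intermediate step holds, here is a minimal sketch on a small made-up frame. The `small` frame, its values, and the names `steps` and `marked` are illustrative assumptions, not part of the answer. It applies the same where/groupby/transform chain, but to the `close` column directly.

```python
import pandas as pd

# Hypothetical data (not the asker's): one TOP run and one BOTTOM run.
small = pd.DataFrame({
    "close": [10, 12, 11, 9, 7, 8],
    "A": ["TDOWN", "TOP", "TOP", "BUP", "BOTTOM", "BOTTOM"],
})

# Every row that is neither TOP nor BOTTOM bumps the counter, so a run of
# TOP/BOTTOM rows shares the key of the separator row just before it.
grps = (~small["A"].isin(["TOP", "BOTTOM"])).cumsum()

# Blank out the rows of the other labels, then broadcast each group's extreme.
top = small["close"].where(small["A"].eq("TOP")).groupby(grps).transform("max")
bottom = small["close"].where(small["A"].eq("BOTTOM")).groupby(grps).transform("min")

# Combine both, then keep only the rows whose close equals the group extreme.
values = top.fillna(bottom)
marked = values.where(values.eq(small["close"]))

steps = pd.DataFrame({"A": small["A"], "close": small["close"],
                      "grps": grps, "top": top, "bottom": bottom,
                      "marked": marked})
print(steps)
```

In this sketch `top` is also filled for the separator row that opens the group; the final comparison against `close` is what narrows the result down. As far as I can tell, that also means a separator row (or a tied row inside the run) whose close happens to equal the group extreme would be marked too, which may or may not matter for your data.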
Answer 1: (score: 1)
Use:
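import numpy as np
# Label each consecutive run of identical values in column A
g = df['A'].ne(df['A'].shift()).cumsum()
# For every run, get the index labels of its highest and lowest close
df1 = (df.groupby(['A', g])['close']
         .agg(['idxmax','idxmin'])
         .stack()
         .reset_index(level=1, drop=True)
         .reset_index(name='idx'))
# Keep idxmin for BOTTOM runs and idxmax for TOP runs
df1['mask'] = ((df1['A'].eq('BOTTOM') & df1['level_1'].eq('idxmin')) |
               (df1['A'].eq('TOP') & df1['level_1'].eq('idxmax')))
print (df1)
# Mark the selected rows in the original DataFrame
mask = df.index.isin(df1.loc[df1['mask'], 'idx'])
df['new'] = np.where(mask, df['close'], '-')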
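The key idiom here is `ne(shift).cumsum()`, which gives every consecutive run of equal labels its own id. Below is a condensed sketch of the same idea on made-up data; it skips the `stack`/`reset_index` step and is my own variant, not the answerer's code. The `small` frame and the names `run_id`, `max_idx` and `min_idx` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data (not the asker's): two TOP runs and one BOTTOM run.
small = pd.DataFrame({
    "close": [10, 12, 11, 9, 7, 8, 13, 15],
    "A": ["TDOWN", "TOP", "TOP", "BUP", "BOTTOM", "BOTTOM", "TOP", "TOP"],
})

# A new run starts wherever the label differs from the previous row.
run_id = small["A"].ne(small["A"].shift()).cumsum()

# Index label of the highest and lowest close inside every run.
max_idx = small.groupby(run_id)["close"].idxmax()
min_idx = small.groupby(run_id)["close"].idxmin()

# Keep the run maximum only for TOP rows and the run minimum only for BOTTOM rows.
mask = ((small["A"].eq("TOP") & small.index.isin(max_idx)) |
        (small["A"].eq("BOTTOM") & small.index.isin(min_idx)))

# Cast to str so the column holds one type alongside the '-' placeholder.
small["new"] = np.where(mask, small["close"].astype(str), "-")
print(small)
```

Because the rows are selected by index label rather than by value, only one row per run is marked even when the extreme value occurs more than once (`idxmax`/`idxmin` return the first occurrence).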
Answer 2: (score: 0)
An alternative answer:
# Import libraries
import numpy as np
import pandas as pd
# Create DataFrame
df = pd.DataFrame()
df['close'] = (7980,7996,8855,8363,8283,8303,8266,8582,8586,8179,8206,7854,8145,8152,8240,8373,8319,8298,8048,8218,8188,8055,8432,8537,9682,10021,9985,10169,10272,10152,10196,10270,10306,10355,10969,10420,10154,10096,10307,10400,10484)
df['A'] = ('TDOWN','TDOWN', 'TDOWN', 'TOP', 'TOP', 'TOP', 'TOP', 'TOP','BUP','BUP','BUP', 'BUP', 'BUP', 'BOTTOM', 'BOTTOM', 'BOTTOM', 'BUP','BUP','BUP','BUP', 'BOTTOM', 'BOTTOM', 'BUP','BUP','BUP', 'BUP','BUP','BUP','BUP', 'BOTTOM', 'BOTTOM', 'BOTTOM', 'BOTTOM','TDOWN','TDOWN', 'TDOWN', 'TOP', 'TOP', 'TOP', 'TOP', 'TOP')
# Create flags for groups
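c = 0
df['flag'] = np.nan
for i in range(df.shape[0]-1):
    # Start a new group whenever the label changes from one row to the next
    if df['A'].iloc[i] != df['A'].iloc[i+1]:
        c += 1
    # .loc avoids chained assignment; the first row keeps NaN, as in the output below
    df.loc[i+1, 'flag'] = c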
# Create grouped object
g = df.groupby(['flag'], as_index=False)
# Get Highest and Lowest
g_max = g.max()
g_max = g_max[g_max['A']=='TOP']
g_min = g.min()
g_min = g_min[g_min['A']=='BOTTOM']
# Combine Highest and lowest
dfg = pd.concat([g_max, g_min])
dfg = dfg.drop('flag', axis=1)
dfg['outcome1'] = dfg['close']
dfg
# Merge with original DataFrame
dfnew = df.merge(dfg, on=['A','close'], how='left').fillna('-')
Output
dfnew
close A flag outcome1
0 7980 TDOWN - -
1 7996 TDOWN 0 -
2 8855 TDOWN 0 -
3 8363 TOP 1 -
4 8283 TOP 1 -
5 8303 TOP 1 -
6 8266 TOP 1 -
7 8582 TOP 1 8582
8 8586 BUP 2 -
9 8179 BUP 2 -
10 8206 BUP 2 -
11 7854 BUP 2 -
12 8145 BUP 2 -
13 8152 BOTTOM 3 8152
14 8240 BOTTOM 3 -
15 8373 BOTTOM 3 -
16 8319 BUP 4 -
17 8298 BUP 4 -
18 8048 BUP 4 -
19 8218 BUP 4 -
20 8188 BOTTOM 5 -
21 8055 BOTTOM 5 8055
22 8432 BUP 6 -
23 8537 BUP 6 -
24 9682 BUP 6 -
25 10021 BUP 6 -
26 9985 BUP 6 -
27 10169 BUP 6 -
28 10272 BUP 6 -
29 10152 BOTTOM 7 10152
30 10196 BOTTOM 7 -
31 10270 BOTTOM 7 -
32 10306 BOTTOM 7 -
33 10355 TDOWN 8 -
34 10969 TDOWN 8 -
35 10420 TDOWN 8 -
36 10154 TOP 9 -
37 10096 TOP 9 -
38 10307 TOP 9 -
39 10400 TOP 9 -
40 10484 TOP 9 10484
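As a sanity check on the table above, here is a plain-Python sketch that walks the runs with `itertools.groupby` and records the position of each extreme. It uses the data from the question; the names `outcome`, `pos` and `marked` are my own, and this is only a cross-check, not a replacement for the pandas answers.

```python
from itertools import groupby

close = (7980,7996,8855,8363,8283,8303,8266,8582,8586,8179,8206,7854,8145,8152,8240,8373,8319,8298,8048,8218,8188,8055,8432,8537,9682,10021,9985,10169,10272,10152,10196,10270,10306,10355,10969,10420,10154,10096,10307,10400,10484)
labels = ('TDOWN','TDOWN', 'TDOWN', 'TOP', 'TOP', 'TOP', 'TOP', 'TOP','BUP','BUP','BUP', 'BUP', 'BUP', 'BOTTOM', 'BOTTOM', 'BOTTOM', 'BUP','BUP','BUP','BUP', 'BOTTOM', 'BOTTOM', 'BUP','BUP','BUP', 'BUP','BUP','BUP','BUP', 'BOTTOM', 'BOTTOM', 'BOTTOM', 'BOTTOM','TDOWN','TDOWN', 'TDOWN', 'TOP', 'TOP', 'TOP', 'TOP', 'TOP')

outcome = ["-"] * len(close)
pos = 0  # position of the first row of the current run
for label, run in groupby(labels):
    n = len(list(run))
    idx = range(pos, pos + n)
    if label == "TOP":
        best = max(idx, key=close.__getitem__)  # position of the run's highest close
        outcome[best] = close[best]
    elif label == "BOTTOM":
        best = min(idx, key=close.__getitem__)  # position of the run's lowest close
        outcome[best] = close[best]
    pos += n

# Positions and values that were marked
marked = {i: v for i, v in enumerate(outcome) if v != "-"}
print(marked)
# {7: 8582, 13: 8152, 21: 8055, 29: 10152, 40: 10484}
```

These are the same five rows that carry a number in the `outcome1` column of the output.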