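I have a DataFrame of quotes whose delta column flags each row as True or False. For every run of consecutive True rows that sits between False rows, I want to keep only the row with the lowest value in the low column and discard the other rows of that run. Finally, all rows where delta is False should be dropped.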
symbol open high low close adjusted volume delta
date
2017-11-06 TOU 23.70 25.09 23.70 25.07 24.7563 999400 False
2017-11-07 TOU 25.10 25.25 24.73 24.77 24.4600 546500 True
2017-11-08 TOU 24.75 25.16 24.41 24.90 24.5884 450000 True
2017-11-09 TOU 25.36 27.26 25.30 26.83 26.4942 2347500 False
2017-11-10 TOU 26.70 27.01 26.45 26.81 26.4745 903600 False
2017-11-13 TOU 26.76 26.85 26.10 26.40 26.0696 733200 False
2017-11-14 TOU 26.30 26.41 25.37 25.48 25.1611 619300 False
2017-11-15 TOU 25.22 25.27 24.72 24.74 24.4304 525800 False
2017-11-16 TOU 24.69 24.90 24.33 24.34 24.0354 516000 True
2017-11-17 TOU 24.67 24.86 23.98 24.00 23.6997 1233100 True
2017-11-20 TOU 24.01 24.03 23.68 23.70 23.4034 977800 True
2017-11-21 TOU 23.86 23.98 23.35 23.46 23.1664 544300 True
... ... ... ... ... ... ... ... ...
2018-09-21 TOU 21.11 21.30 20.91 20.99 20.9900 1235800 True
2018-09-24 TOU 21.19 21.72 21.19 21.66 21.6600 995800 False
2018-09-25 TOU 21.83 21.83 21.45 21.45 21.4500 574100 False
2018-09-26 TOU 21.38 21.65 20.88 20.97 20.9700 791600 True
2018-09-27 TOU 21.36 22.69 21.23 22.67 22.6700 1192500 False
2018-09-28 TOU 22.58 23.27 22.29 22.74 22.7400 1376300 False
2018-10-01 TOU 23.15 23.86 22.75 23.01 23.0100 1137200 False
2018-10-02 TOU 23.05 23.05 22.51 22.59 22.5900 801600 False
2018-10-03 TOU 22.65 23.59 22.43 23.52 23.5200 1391100 False
2018-10-04 TOU 23.35 23.35 22.39 22.47 22.4700 1272900 False
2018-10-05 TOU 22.62 22.66 22.19 22.62 22.6200 668300 False
2018-10-09 TOU 22.70 23.44 22.53 23.41 23.4100 832800 False
2018-10-10 TOU 23.38 23.38 22.27 22.30 22.3000 1435300 False
2018-10-11 TOU 21.84 22.08 21.16 21.28 21.2800 1127700 False
2018-10-12 TOU 21.78 21.80 21.12 21.18 21.1800 887300 True
2018-10-15 TOU 21.32 21.42 20.58 20.68 20.6800 852300 True
2018-10-16 TOU 20.80 20.80 20.34 20.44 20.4400 1115200 True
2018-10-17 TOU 20.38 20.48 20.03 20.09 20.0900 700900 True
2018-10-18 TOU 20.00 20.01 19.32 19.50 19.5000 1188600 True
2018-10-19 TOU 19.59 20.15 19.57 19.94 19.9400 1321600 True
2018-10-22 TOU 19.96 20.08 19.73 19.80 19.8000 828200 True
I have roughly solved the problem with the following code:
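import pandas as pd

tdf = tdf.reset_index()
temp = tdf.iloc[0:0]   # rows of the current run of True values
final = tdf.iloc[0:0]  # lowest-low row(s) of every finished run
for index, row in tdf.iterrows():
    if not row.delta:
        # a False row ends the current run: keep only its lowest low
        temp = temp[temp.low == temp.low.min()]
        final = pd.concat([final, temp])  # DataFrame.append was removed in pandas 2.0
        temp = temp.iloc[0:0]
    else:
        temp = pd.concat([temp, tdf.loc[[index]]])
    if row.date == tdf.iloc[-1].date:
        # last row: flush the run that reaches the end of the data
        temp = temp[temp.low == temp.low.min()]
        final = pd.concat([final, temp])
        temp = temp.iloc[0:0]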
Result in the final DataFrame:
Please correct me if I am wrong!
Answer 0 (score: 0)
Here is how I would go about it:
#creating the dataframe
import pandas as pd
from io import StringIO
s = '''date symbol open high low close adjusted volume delta
2017-11-06 TOU 23.70 25.09 23.70 25.07 24.7563 999400 False
2017-11-07 TOU 25.10 25.25 24.73 24.77 24.4600 546500 True
2017-11-08 TOU 24.75 25.16 24.41 24.90 24.5884 450000 True
2017-11-09 TOU 25.36 27.26 25.30 26.83 26.4942 2347500 False
2017-11-10 TOU 26.70 27.01 26.45 26.81 26.4745 903600 False
2017-11-13 TOU 26.76 26.85 26.10 26.40 26.0696 733200 False
2017-11-14 TOU 26.30 26.41 25.37 25.48 25.1611 619300 False
2017-11-15 TOU 25.22 25.27 24.72 24.74 24.4304 525800 False
2017-11-16 TOU 24.69 24.90 24.33 24.34 24.0354 516000 True
2017-11-17 TOU 24.67 24.86 23.98 24.00 23.6997 1233100 True
2017-11-20 TOU 24.01 24.03 23.68 23.70 23.4034 977800 True
2017-11-21 TOU 23.86 23.98 23.35 23.46 23.1664 544300 True
2018-09-21 TOU 21.11 21.30 20.91 20.99 20.9900 1235800 True
2018-09-24 TOU 21.19 21.72 21.19 21.66 21.6600 995800 False
2018-09-25 TOU 21.83 21.83 21.45 21.45 21.4500 574100 False
2018-09-26 TOU 21.38 21.65 20.88 20.97 20.9700 791600 True
2018-09-27 TOU 21.36 22.69 21.23 22.67 22.6700 1192500 False
2018-09-28 TOU 22.58 23.27 22.29 22.74 22.7400 1376300 False
2018-10-01 TOU 23.15 23.86 22.75 23.01 23.0100 1137200 False
2018-10-02 TOU 23.05 23.05 22.51 22.59 22.5900 801600 False
2018-10-03 TOU 22.65 23.59 22.43 23.52 23.5200 1391100 False
2018-10-04 TOU 23.35 23.35 22.39 22.47 22.4700 1272900 False
2018-10-05 TOU 22.62 22.66 22.19 22.62 22.6200 668300 False
2018-10-09 TOU 22.70 23.44 22.53 23.41 23.4100 832800 False
2018-10-10 TOU 23.38 23.38 22.27 22.30 22.3000 1435300 False
2018-10-11 TOU 21.84 22.08 21.16 21.28 21.2800 1127700 False
2018-10-12 TOU 21.78 21.80 21.12 21.18 21.1800 887300 True
2018-10-15 TOU 21.32 21.42 20.58 20.68 20.6800 852300 True
2018-10-16 TOU 20.80 20.80 20.34 20.44 20.4400 1115200 True
2018-10-17 TOU 20.38 20.48 20.03 20.09 20.0900 700900 True
2018-10-18 TOU 20.00 20.01 19.32 19.50 19.5000 1188600 True
2018-10-19 TOU 19.59 20.15 19.57 19.94 19.9400 1321600 True
2018-10-22 TOU 19.96 20.08 19.73 19.80 19.8000 828200 True'''
# split on any whitespace so the text above parses whether it is tab- or space-separated
df = pd.read_csv(StringIO(s), sep=r'\s+', index_col=0)
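Restrict the data to the rows between the first and the last False in the delta column, because the True rows have to be sandwiched between False values:
# .copy() avoids a SettingWithCopyWarning when columns are added below
df1 = df[df[df.delta==False].index.min():df[df.delta==False].index.max()].copy()
Convert the delta column to integers and flip the values. The reason for this will become clear below:
df1['deltaInt'] = df1.delta.astype('int')
df1.deltaInt = 1 - df1.deltaInt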
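Add a cumulative sum column, so that each False row and the run of True rows that follows it share the same value:
df1['cumSum'] = df1.deltaInt.cumsum()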
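To see why the flipped values are useful, here is a small sketch on a made-up series of flags (the values are purely illustrative and not taken from the data above). Every False adds 1 to the running total, so each run of consecutive True rows ends up with the same label as the False row just before it:

```python
import pandas as pd

# Hypothetical flags, chosen only to illustrate the labelling trick
delta = pd.Series([False, True, True, False, False, True, True, True, False])

# Flip the flags (False -> 1, True -> 0) and take the running total
flipped = 1 - delta.astype(int)
run_id = flipped.cumsum()

print(pd.DataFrame({"delta": delta, "flipped": flipped, "run_id": run_id}))
# run_id is 1, 1, 1, 2, 3, 3, 3, 3, 4
```

Grouping by that label then treats each run of True rows as one group.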
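Drop the first row:
df1 = df1.iloc[1:,:]
Group by the cumulative sum and select the row(s) with the minimum value in the low column:
df2 = df1.groupby('cumSum').apply(lambda x:x[x.low==x.low.min()])
Keep only the rows where delta is True, which gives the desired result:
df2 = df2[df2.delta]
print(df2)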
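Output:
Of course, you can drop the two extra columns afterwards. Hope this helps. This is not optimized code, just the first thing that came to mind.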
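For what it is worth, the same idea can probably be written more compactly. The following is only a sketch under a few assumptions: the function name `lowest_low_per_true_run` and the demo frame are made up for illustration, the index is assumed to be unique, and the columns are assumed to be called `delta` and `low` as in the question. Unlike the steps above, it filters to the True rows before taking the minimum, so a False row with a lower low cannot displace a run, and it also keeps a run that reaches the end of the data (as the loop in the question does):

```python
import pandas as pd

def lowest_low_per_true_run(df: pd.DataFrame) -> pd.DataFrame:
    """For each run of consecutive delta == True rows, keep the row(s) with the lowest low."""
    is_true = df["delta"].astype(bool)
    # Every False row bumps the label, so a run of True rows shares one label
    run_id = (~is_true).cumsum()
    true_rows = df[is_true]
    # Lowest low within each run, broadcast back onto the rows of that run
    run_min = true_rows.groupby(run_id[is_true])["low"].transform("min")
    return true_rows[true_rows["low"] == run_min]

# Made-up sample data (assumes a unique index, like the dates in the question)
demo = pd.DataFrame(
    {
        "low": [10.0, 9.5, 9.1, 11.0, 8.7, 8.2],
        "delta": [False, True, True, False, True, True],
    },
    index=["d1", "d2", "d3", "d4", "d5", "d6"],
)
result = lowest_low_per_true_run(demo)
print(result)
```

No helper columns are added to the frame, so there is nothing to drop afterwards. If several rows of a run tie for the lowest low, all of them are kept, which matches the `x.low == x.low.min()` filter used above.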