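I want to find the percentage difference between two consecutive rows of a column and, if that difference is greater than 10%, return the first value. For example, in the data below I want to find the percentage difference between df.close[0] and df.close[1]. If the difference is greater than 10, I want to set the value of df.close[0] to df.close[1]; if it is less than 10, I want to keep df.close[0] and df.close[1] as they are. How can I do this?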
            1. open  2. high   3. low  4. close  5. volume
date
2000-01-03  41.7917  42.5000  40.8333   41.2500  2006460.0
2000-01-04  41.0833  41.0833  38.2500   39.2917  3392856.0
2000-01-05  37.2083  37.2083  34.0000   34.5500  4344624.0
2000-01-06  34.5000  36.3333  34.5000   35.6708  2219904.0
2000-01-07  39.1667  43.2500  38.6667   43.2500  7155936.0
I tried the following code, but it does not seem to work:
def percentage_diff(x):
    if (abs((x[0]-x[1]/x[0])*100)>10):
        return x[0]
    else:
        return x[1]

df.close = pd.rolling_apply(df['close'], 2, percentage_diff)
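Answer 0 (score: 0)
It seems you want to replace the value of x[0] with x[1] when the percentage difference between the two values, ((x[0]-x[1])/x[0])*100, is less than 10%. Note the parentheses around x[0]-x[1]: in your version the division is evaluated before the subtraction. It is not clear whether you want to return x or only an element of x.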
def percentage_diff(x):
    if (abs((x[0]-x[1])/x[0])*100) > 10:
        return x  # or return x[0] if that is what you really want.
    else:
        x[0] = x[1]
        return x  # or return x[1] if that is what you really want.

print(percentage_diff([1,1,3,4,54,9]))  # the percentage difference between 1 and 1 is less than 10%
print(percentage_diff([1,2,3,4,54,9]))  # the percentage difference between 1 and 2 is more than 10%
Here is the output of the code above:
>>> [1, 1, 3, 4, 54, 9]
>>> [1, 2, 3, 4, 54, 9]
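To make the effect of the parentheses concrete, here is a small sketch that evaluates both expressions on the first two close values from the question's data. The variable names `as_written` and `corrected` are only illustrative and do not appear in the original posts.

```python
# Two consecutive closes taken from the question's sample data.
x = [41.2500, 39.2917]

# As written in the question: division binds tighter than subtraction,
# so this evaluates x[0] - (x[1] / x[0]) before multiplying by 100.
as_written = abs((x[0] - x[1] / x[0]) * 100)

# With the parentheses fixed: (x[0] - x[1]) / x[0], then times 100.
corrected = abs((x[0] - x[1]) / x[0]) * 100

print(round(as_written, 2))  # about 4029.75, so the > 10 test always passes
print(round(corrected, 2))   # about 4.75, the actual percentage difference
```

With the expression from the question, the condition is true for any realistic price pair, so the function always returns x[0].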
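To apply the function to a column of a pandas DataFrame, use a rolling window of two rows. The function must then return a single element (x[0] or x[1]) rather than the whole list, because a rolling apply expects one number per window. A plain df.close.apply(percentage_diff) would pass one scalar at a time and fail on x[0].
df['close'] = df['close'].rolling(2).apply(percentage_diff, raw=True)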
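As a rough end-to-end sketch, the following applies a scalar-returning version of the function to the close values from the question. It assumes a pandas version that provides `Series.rolling(...).apply(..., raw=True)` (the `pd.rolling_apply` function used in the question is no longer available in recent releases). The Series is rebuilt by hand here, and the name `result` is only illustrative.

```python
import pandas as pd

def percentage_diff(x):
    # x holds two consecutive values: x[0] is the earlier row, x[1] the later one.
    if abs((x[0] - x[1]) / x[0]) * 100 > 10:
        return x[0]
    return x[1]

# Close column from the question's sample data, rebuilt by hand.
close = pd.Series(
    [41.2500, 39.2917, 34.5500, 35.6708, 43.2500],
    index=pd.to_datetime(
        ["2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07"]
    ),
    name="close",
)

# raw=True passes each window as a NumPy array, so x[0] and x[1] are positional.
result = close.rolling(2).apply(percentage_diff, raw=True)
print(result)
```

The first row has no predecessor, so it comes back as NaN. Each window is built from the original values, not from values already replaced in an earlier window.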
Answer 1 (score: 0)
I was able to solve this problem with the following function.
def percentage_diff(x):
    per = abs(x[0] - x[1]) / x[1] * 100
    if per > 30:
        return min(x[0], x[1])
    else:
        return x[0]
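In my original question I returned x[0] or x[1] when the percentage difference was greater than 10, which only moved the value to the next row and did not actually remove the outlier.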
def percentage_diff(x):
    if (abs((x[0]-x[1]/x[0])*100)>10):
        return x[0]
    else:
        return x[1]

df.close = pd.rolling_apply(df['close'], 2, percentage_diff)
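The difference between the two versions can be seen on a small made-up series with one spike. This is only a sketch: the values in `closes` are hypothetical, the function names `keep_min_on_jump` and `original_idea` are illustrative, and the second function uses the question's logic with the parentheses corrected rather than the code exactly as posted.

```python
def keep_min_on_jump(x):
    # Version from this answer: on a jump above 30%, keep the smaller value.
    per = abs(x[0] - x[1]) / x[1] * 100
    if per > 30:
        return min(x[0], x[1])
    return x[0]

def original_idea(x):
    # Logic from the question (parentheses corrected): on a jump above 10%,
    # return the earlier value, otherwise the later one.
    if abs((x[0] - x[1]) / x[0]) * 100 > 10:
        return x[0]
    return x[1]

# Hypothetical closes with a single outlier (80.0).
closes = [40.0, 41.0, 80.0, 42.0, 43.0]

# Consecutive pairs, mimicking a rolling window of size 2.
pairs = list(zip(closes, closes[1:]))

print([keep_min_on_jump(p) for p in pairs])  # [40.0, 41.0, 42.0, 42.0]
print([original_idea(p) for p in pairs])     # [41.0, 41.0, 80.0, 43.0]
```

In the second output the 80.0 is still present, one position later, which is the behaviour described above. In the first output it no longer appears.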