Structure of the pivot table:
    Crypto Exchange                 ABUCOINS  BINANCE  ...  ZAIF   Average  StdDeviation
    Transaction  time_period_start
    BTC_USD      01:00:00 PM        13762.75  NaN      ...  NaN   13482.51  700.253
                 02:00:00 AM        13563.53  NaN      ...  NaN   13591.80  782.476
                 .
                 .
                 12:00:00 PM        13630.80  NaN      ...  NaN   13595.96  497.21
    .
    .
    .
    ZYD_USD      01:00:00 AM        NaN       0.045    ...  NaN   0.032     0.02
                 .
                 12:00:00 PM
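The dataframe contains a little over 5000 transactions across 23 crypto exchange platforms. The pivot table is indexed by Transaction and time_period_start, with Crypto Exchange spread across the columns.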
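For context, a pivot of this shape could be built roughly as follows. This is only a minimal sketch: the long-format frame `raw`, its `price` column, and the sample values are hypothetical stand-ins, not the actual data.

```python
import pandas as pd

# Hypothetical long-format data; names and values are illustrative only.
raw = pd.DataFrame({
    "Transaction": ["BTC_USD", "BTC_USD", "BTC_USD", "ZYD_USD"],
    "time_period_start": ["01:00:00 PM", "01:00:00 PM", "02:00:00 AM", "01:00:00 AM"],
    "Crypto Exchange": ["ABUCOINS", "ZAIF", "ABUCOINS", "BINANCE"],
    "price": [13762.75, 13200.00, 13563.53, 0.045],
})

# Rows: (Transaction, time_period_start); columns: one per exchange.
pivot = raw.pivot_table(
    index=["Transaction", "time_period_start"],
    columns="Crypto Exchange",
    values="price",
)

# Per-row statistics over the exchange columns only (NaN is skipped).
exchange_cols = pivot.columns.tolist()
pivot["Average"] = pivot[exchange_cols].mean(axis=1)
pivot["StdDeviation"] = pivot[exchange_cols].std(axis=1)
```

Exchanges with no quote for a given row show up as NaN, which matches the printout above.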
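To identify outliers, I want to flag every individual price in the exchange columns that lies more than that row's StdDeviation below (or above) that row's Average.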
I looked through various existing questions and came up with the following function:
    def findOutlier(x):
        for i in range(len(x)-1):
            if x[i] is not None and abs(x[i] - x[-2]) > x[-1]:
                x[i] = True
        return x[0]
Then:

    df_temp = df_with_sd.apply(findOutlier, axis=1)
The problem with this approach:
... symbol?) How can I get the desired result?
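One possible direction, sketched under assumptions: compare all exchange columns against the row's Average at once instead of looping inside `apply`. The small frame below and the `price_cols` list are hypothetical; only the Average and StdDeviation column names come from the table above. Note that `x[i] is not None` does not filter out NaN, whereas a comparison against NaN simply evaluates to False, so missing prices are never flagged here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_with_sd; values are illustrative only.
df_with_sd = pd.DataFrame(
    {
        "ABUCOINS": [100.0, 10.0],
        "BINANCE": [130.0, np.nan],
        "ZAIF": [102.0, 12.0],
    },
    index=pd.MultiIndex.from_tuples(
        [("AAA_USD", "01:00:00 AM"), ("BBB_USD", "01:00:00 AM")],
        names=["Transaction", "time_period_start"],
    ),
)
price_cols = ["ABUCOINS", "BINANCE", "ZAIF"]  # assumed list of exchange columns
df_with_sd["Average"] = df_with_sd[price_cols].mean(axis=1)
df_with_sd["StdDeviation"] = df_with_sd[price_cols].std(axis=1)

# Absolute distance of every price from its own row's Average.
deviation = df_with_sd[price_cols].sub(df_with_sd["Average"], axis=0).abs()

# True where that distance exceeds the row's StdDeviation; NaN compares as False.
is_outlier = deviation.gt(df_with_sd["StdDeviation"], axis=0)
```

`is_outlier` has the same index and exchange columns as the pivot, so it can be used as a mask, for example `df_with_sd[price_cols].where(~is_outlier)` to blank out the flagged prices while leaving the original frame unchanged.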