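I have a time series of monthly temperature anomalies spanning 60 years. I only want to pass through the temperature values where the anomaly is greater than 0.5 for six or more consecutive months. I have found it easy to replace values < 0.5 with NaN, but I am not sure how to replace values that are > 0.5 when only 2 or 3 consecutive values exceed 0.5. The code snippet is below: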
import pandas as pd

time = [1950.04167, 1950.125 , 1950.20833, 1950.29167, 1950.375 ,
1950.45833, 1950.54167, 1950.625 , 1950.70833, 1950.79167,
1950.875 , 1950.95833, 1951.04167, 1951.125 , 1951.20833,
1951.29167, 1951.375 , 1951.45833, 1951.54167, 1951.625 ,
1951.70833, 1951.79167, 1951.875 , 1951.95833, 1952.04167,
1952.125 , 1952.20833, 1952.29167, 1952.375 , 1952.45833,
1952.54167, 1952.625 , 1952.70833, 1952.79167, 1952.875 ,
1952.95833, 1953.04167, 1953.125 , 1953.20833, 1953.29167,
1953.375 , 1953.45833, 1953.54167, 1953.625 , 1953.70833,
1953.79167, 1953.875 , 1953.95833, 1954.04167, 1954.125 ,
1954.20833, 1954.29167, 1954.375 , 1954.45833, 1954.54167,
1954.625 , 1954.70833, 1954.79167, 1954.875 , 1954.95833]
sst = [-1.67623 , -1.685853, -1.69083 , -1.61898 , -1.40235 ,
-1.097773, -0.835867, -0.718727, -0.694087, -0.785423,
-0.9312 , -1.01925 , -0.8868 , -0.48022 , -0.007597,
0.448647, 0.66546 , 0.852427, 0.922443, 1.14481 ,
1.291153, 1.338903, 0.993053, 0.68006, 0.493597,
0.500197, 0.528363, 0.515583, 0.418493, 0.168387,
-0.003403, 0.033933, 0.15759 , 0.113847, 0.019967,
0.111413, 0.372967, 0.623067, 0.763903, 0.909743,
0.990287, 1.01288 , 0.969407, 0.985817, 0.982607,
1.01244 , 1.039917, 1.11755, 1.044333, 0.799593,
0.3769 , 0.105033, -0.070743, -0.281483, -0.59861,
-0.875743, -0.88768 , -0.642517, -0.548043, -0.547057]
series = pd.Series(data=sst, index=time)
# Replace every value below 0.5 with NaN
greater = series.where(series >= 0.5)
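So, for example, I would like to "pass" the SST values corresponding to the time spans 1951.375 to 1951.95833 and 1953.125 to 1954.125, where SST is greater than 0.5 for 8 and 13 consecutive values respectively, but replace with NaN the SST values corresponding to 1952.125 to 1952.29167, where only 3 consecutive values are > 0.5.
Any suggestions? TIA!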
Answer 0 (score: 0)
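You can find the lengths of the runs of values > 0.5 with series.groupby(series.le(0.5).cumsum()), then use .apply() to replace the values in runs that are too short with NaN.
.groupby ends up lumping the preceding <= 0.5 value in with each run, so every group is one element longer than the run it contains. The length cutoff has to allow for that extra element, and the first value of each group we keep is replaced with np.nan.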
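To make the grouping key concrete, here is a minimal sketch on made-up data (the `toy` series is hypothetical and is not the SST series from the question). It only shows how `le(0.5).cumsum()` labels the elements and why each group size counts one extra leading value.

```python
import pandas as pd

# Hypothetical toy data, chosen only to illustrate the grouping key.
toy = pd.Series([0.1, 0.7, 0.9, 0.2, 0.6, 0.3, 0.8, 0.8, 0.8])

# le(0.5) is True wherever the value is <= 0.5. Every True bumps the
# running count, so each <= 0.5 value opens a new group.
key = toy.le(0.5).cumsum()

# Each group holds one leading <= 0.5 value plus the run of > 0.5 values
# that follows it, so its size is the run length plus one.
sizes = toy.groupby(key).size()

print(key.tolist())    # [1, 1, 1, 2, 2, 3, 3, 3, 3]
print(sizes.tolist())  # [3, 2, 4]
```

The runs of values above 0.5 in this toy series have lengths 2, 1 and 3, one less than the corresponding group sizes.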
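In [61]: (
    series
    # group_keys=False keeps the original index on pandas >= 2.0
    .groupby(series.le(0.5).cumsum(), group_keys=False)
    # each group carries one leading <= 0.5 value, so a run of six or
    # more values > 0.5 means len(x) >= 7
    .apply(lambda x: pd.Series(np.nan if len(x) < 7 else [np.nan] + list(x)[1:], x.index))
)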
Out[61]:
1950.04167 NaN
1950.12500 NaN
1950.20833 NaN
1950.29167 NaN
1950.37500 NaN
1950.45833 NaN
1950.54167 NaN
1950.62500 NaN
1950.70833 NaN
1950.79167 NaN
1950.87500 NaN
1950.95833 NaN
1951.04167 NaN
1951.12500 NaN
1951.20833 NaN
1951.29167 NaN
1951.37500 0.665460
1951.45833 0.852427
1951.54167 0.922443
1951.62500 1.144810
1951.70833 1.291153
1951.79167 1.338903
1951.87500 0.993053
1951.95833 0.680060
1952.04167 NaN
1952.12500 NaN
1952.20833 NaN
1952.29167 NaN
1952.37500 NaN
1952.45833 NaN
1952.54167 NaN
1952.62500 NaN
1952.70833 NaN
1952.79167 NaN
1952.87500 NaN
1952.95833 NaN
1953.04167 NaN
1953.12500 0.623067
1953.20833 0.763903
1953.29167 0.909743
1953.37500 0.990287
1953.45833 1.012880
1953.54167 0.969407
1953.62500 0.985817
1953.70833 0.982607
1953.79167 1.012440
1953.87500 1.039917
1953.95833 1.117550
1954.04167 1.044333
1954.12500 0.799593
1954.20833 NaN
1954.29167 NaN
1954.37500 NaN
1954.45833 NaN
1954.54167 NaN
1954.62500 NaN
1954.70833 NaN
1954.79167 NaN
1954.87500 NaN
1954.95833 NaN
dtype: float64
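As a possible alternative, the same filtering can be sketched without `.apply()` by labelling each run directly and measuring its length. This is only a sketch under stated assumptions: the function name `keep_long_runs`, its parameters `threshold` and `min_run`, and the `demo` data are all hypothetical and do not come from the question or the answer. It treats "above" as strictly greater than the threshold, matching the answer rather than the `>=` in the question, and it does not rely on every run being preceded by a value <= 0.5.

```python
import pandas as pd


def keep_long_runs(series, threshold=0.5, min_run=6):
    """Keep values in runs of at least `min_run` consecutive values
    strictly above `threshold`; everything else becomes NaN.

    Hypothetical helper written for illustration only.
    """
    above = series.gt(threshold)
    # A new run id starts wherever `above` differs from the previous element.
    run_id = above.ne(above.shift()).cumsum()
    # Length of the run each element belongs to, aligned with `series`.
    run_len = above.groupby(run_id).transform("size")
    return series.where(above & run_len.ge(min_run))


# Made-up demo data: a run of 2, then a run of 3 values above 0.5.
demo = pd.Series([0.6, 0.7, 0.1, 0.9, 0.9, 0.9, 0.2])
result = keep_long_runs(demo, min_run=3)
print(result)
```

With the series from the question, calling `keep_long_runs(series)` with the default `min_run=6` should keep the 8-value and 13-value runs and blank out the 3-value run.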