Convert a pandas Series to a monotonic one

Time: 2020-08-28 17:34:37

Tags: python pandas series

I'm looking for a way to remove the points that break the monotonicity of a Series.

For example, from either of

s = pd.Series([0,1,2,3,10,4,5,6])

s = pd.Series([0,1,2,3,-1,4,5,6])

we would extract

s = pd.Series([0,1,2,3,4,5,6])

NB: we assume the first element is always correct.

2 answers:

Answer 0: (score: 0)

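Monotonic can mean either increasing or decreasing. The functions below keep only the values that preserve monotonicity and drop the ones that break it.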

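However, there seems to be some confusion in your question. Given the series s = pd.Series([0,1,2,3,10,4,5,6]), 10 does not break the monotonicity condition; 4, 5, 6 do. So the correct answer for that series is 0, 1, 2, 3, 10.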

import pandas as pd

s = pd.Series([0,1,2,3,10,4,5,6])

def to_monotonic_inc(s):
    return s[s >= s.cummax()]

def to_monotonic_dec(s):
    return s[s <= s.cummin()]

print(to_monotonic_inc(s))
print(to_monotonic_dec(s))

The output is 0, 1, 2, 3, 10 for increasing and 0 for decreasing.
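As a quick check, here is the same cummax filter applied to the second series from the question (the one with the -1 dip). This is only a sketch reusing the expression from to_monotonic_inc; the names s2 and kept are mine. In the dip case the filter already gives the result the question asks for; only the spike case (10) behaves differently.

```python
import pandas as pd

# Second example from the question: a single dip (-1) below the running max
s2 = pd.Series([0, 1, 2, 3, -1, 4, 5, 6])

# Keep only the points that are at or above the running maximum
kept = s2[s2 >= s2.cummax()]

print(kept.tolist())        # [0, 1, 2, 3, 4, 5, 6]
print(kept.index.tolist())  # [0, 1, 2, 3, 5, 6, 7] (original index labels are preserved)
```

The original index labels survive the filtering, so call reset_index(drop=True) on the result if you need a contiguous index.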

Perhaps you want to find the longest monotonic array? That is a completely different search problem.

-----EDIT-----

Here is a simple way to find the longest monotonically increasing array under your constraints, using plain Python:

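def get_longest_monotonic_asc(s):
    # Pair each value with its position, keep only values >= the first element,
    # sort by value, and drop the first pair (the first element itself)
    enumerated = sorted([(v, i) for i, v in enumerate(s) if v >= s[0]])[1:]
    output = [s[0]]
    last_index = 0
    # Walk the values in ascending order and keep those that also appear
    # later in the original sequence than the last value kept
    for v, i in enumerated:
        if i > last_index:
            last_index = i
            output.append(v)

    return output

s1 = [0,1,2,3,10,4,5,6]
s2 = [0,1,2,3,-1,4,5,6]

print(get_longest_monotonic_asc(s1))
print(get_longest_monotonic_asc(s2))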

'''
Output:

[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6]

'''

Note that this solution involves a sorting step, which is O(n log n), followed by a second pass, which is O(n).
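One caveat: the greedy pass above works for both example inputs, but it is not guaranteed to return the longest run in general. For [0, 10, 11, 12, 1] it returns [0, 1], because the late 1 is picked first and blocks everything else. If you really need the longest non-decreasing subsequence that starts at the first element, a sketch along these lines (patience sorting with bisect, still O(n log n)) should do it. The function name longest_non_decreasing_from_first is my own, and it assumes a plain list of comparable values without NaN.

```python
from bisect import bisect_right


def longest_non_decreasing_from_first(values):
    """Longest non-decreasing subsequence that starts with values[0].

    Hypothetical helper for illustration; assumes a plain list without NaN.
    """
    if not values:
        return []
    first = values[0]
    # Only values >= the first element can follow it
    candidates = [v for i, v in enumerate(values) if i == 0 or v >= first]

    tails = []      # tails[k]: smallest tail value of a subsequence of length k + 1
    tails_pos = []  # position in candidates of that tail
    prev = [-1] * len(candidates)  # back-pointers used to rebuild the answer

    for pos, v in enumerate(candidates):
        # bisect_right allows ties, so the result is non-decreasing
        k = bisect_right(tails, v)
        if k == len(tails):
            tails.append(v)
            tails_pos.append(pos)
        else:
            tails[k] = v
            tails_pos[k] = pos
        prev[pos] = tails_pos[k - 1] if k > 0 else -1

    # Follow the back-pointers from the tail of the longest subsequence
    result = []
    pos = tails_pos[-1]
    while pos != -1:
        result.append(candidates[pos])
        pos = prev[pos]
    return result[::-1]


print(longest_non_decreasing_from_first([0, 1, 2, 3, 10, 4, 5, 6]))  # [0, 1, 2, 3, 4, 5, 6]
print(longest_non_decreasing_from_first([0, 1, 2, 3, -1, 4, 5, 6]))  # [0, 1, 2, 3, 4, 5, 6]
print(longest_non_decreasing_from_first([0, 10, 11, 12, 1]))         # [0, 10, 11, 12]
```

Because every candidate is >= the first element, the first element always stays at the bottom of tails, so every reconstructed chain starts with it.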

Answer 1: (score: 0)

Here is one way to produce a monotonically increasing series:

import pandas as pd

# create data
s = pd.Series([1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 5, 6, 7, 8])

# find max so far (i.e., running_max)
df = pd.concat([s.rename('orig'), 
                s.cummax().rename('running_max'),
               ], axis=1)

# are we at or above max so far?
df['keep?'] = (df['orig'] >= df['running_max'])

# filter out one or many points below max so far
df = df.loc[ df['keep?'], 'orig']

# verify that remaining points are monotonically increasing
assert pd.Index(df).is_monotonic_increasing

# print(df.drop_duplicates()) # eliminates ties
print(df)                     # keeps ties

0     1
1     2
2     3
3     4
4     5
10    5 # <-- same as previous value -- a tie
11    6
12    7
13    8
Name: orig, dtype: int64
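If you would rather exclude ties at the masking step instead of calling drop_duplicates afterwards, one option is to compare each point with the running max of the points before it. This is just a sketch: the helper name keep_running_max and its strict flag are my own, and it assumes the series contains no NaN.

```python
import pandas as pd


def keep_running_max(s: pd.Series, strict: bool = False) -> pd.Series:
    """Keep points at or above the running max; with strict=True, drop ties too.

    Hypothetical helper for illustration; assumes s has no NaN values.
    """
    running_max = s.cummax()
    if not strict:
        return s[s >= running_max]
    # Running max of all *earlier* points; NaN for the first point, which is always kept
    prev_max = running_max.shift()
    return s[prev_max.isna() | (s > prev_max)]


s = pd.Series([1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 5, 6, 7, 8])

print(keep_running_max(s).tolist())               # [1, 2, 3, 4, 5, 5, 6, 7, 8]
print(keep_running_max(s, strict=True).tolist())  # [1, 2, 3, 4, 5, 6, 7, 8]
```

With strict=True the tie at index 10 is dropped and the result is strictly increasing.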

You can use s.plot(); df.plot(); to view the result graphically.