Compare rows of a data frame based on multiple conditions

Time: 2019-12-16 10:28:56

Tags: python pandas

My data frame looks like this:

        price   trend       decreasing_increasing  condition_decreasing
    0   7610.4  no trend    0.0                    False
    1   7610.4  no trend    0.0                    False
    2   7610.4  no trend    0.0                    False
    3   7610.4  decreasing  7610.4                 True
    4   7610.4  decreasing  7610.4                 False
    5   7610.4  decreasing  7610.4                 False
    6   7610.4  decreasing  7610.4                 False
    7   7610.4  decreasing  7610.4                 False
    8   7610.4  decreasing  7610.4                 False
    9   7610.4  decreasing  7610.4                 False
    10  7610.3  no trend    0.0                    True
    11  7610.3  no trend    0.0                    False
    12  7613.9  no trend    0.0                    False
    13  7613.9  no trend    0.0                    False
    14  7613.4  no trend    0.0                    False
    15  7613    decreasing  7613                   True
    16  7612    decreasing  7612                   False
    17  7612    decreasing  7612                   False
    18  7612    no trend    7612                   True

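What I basically need to do is this: when the trend column changes from no trend to decreasing, take the value in the price column and compare it with the value in the price column at the point where trend changes back from decreasing to no trend. So in the example above, the value 7610.4 at row 3 would be compared with the value 7610.3 at row 10.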

I tried adding a column that indicates when the trend column changes, using the following code: condition_decreasing = (data['trend'] != data['trend'].shift(1))
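As a quick illustration of what that comparison produces, here is a minimal sketch on a made-up five-row trend column (the values are hypothetical, not taken from the table above). Note that the first row is flagged as well, because shift(1) leaves it with no previous value to compare against.

```python
import pandas as pd

# Hypothetical miniature of the trend column, for illustration only
trend = pd.Series(['no trend', 'no trend', 'decreasing', 'decreasing', 'no trend'])

# True wherever the label differs from the one on the previous row
changed = trend != trend.shift(1)

print(changed.tolist())
```

If the first row should not count as a change, one option is to fill the shifted value with the row's own label before comparing.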

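But after that I can't figure out how to iterate over the data frame and compare the two prices... Any ideas? Thanks for your help!

The expected output would be a data frame like this: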

    price   trend       decreasing_increasing  condition_decreasing  output
0   7610.4  no trend    0.0                    False
1   7610.4  no trend    0.0                    False
2   7610.4  no trend    0.0                    False
3   7610.4  decreasing  7610.4                 True
4   7610.4  decreasing  7610.4                 False
5   7610.4  decreasing  7610.4                 False
6   7610.4  decreasing  7610.4                 False
7   7610.4  decreasing  7610.4                 False
8   7610.4  decreasing  7610.4                 False
9   7610.4  decreasing  7610.4                 False
10  7610.3  no trend    0.0                    True                  -0.1
11  7610.3  no trend    0.0                    False
12  7613.9  no trend    0.0                    False
13  7613.9  no trend    0.0                    False
14  7613.4  no trend    0.0                    False
15  7613    decreasing  7613                   True
16  7612    decreasing  7612                   False
17  7612    decreasing  7612                   False
18  7612    no trend    7612                   True                  -1

So basically a column that contains the difference between the two prices, for example 7610.3 - 7610.4.
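The comparison described above can be sketched as an explicit loop. This is only an illustrative sketch, not one of the posted answers: the data frame is rebuilt from the price and trend columns of the sample, and start_price and prev are helper names introduced here.

```python
import pandas as pd

# Sample rebuilt from the table above; only the two relevant columns are kept
data = pd.DataFrame({
    'price': [7610.4] * 10 + [7610.3] * 2 + [7613.9] * 2
             + [7613.4, 7613.0] + [7612.0] * 3,
    'trend': ['no trend'] * 3 + ['decreasing'] * 7 + ['no trend'] * 5
             + ['decreasing'] * 3 + ['no trend'],
})

output = [float('nan')] * len(data)
start_price = None  # price on the row where the current decreasing run began
prev = None         # trend label of the previous row

for i, (price, trend) in enumerate(zip(data['price'], data['trend'])):
    if trend == 'decreasing' and prev != 'decreasing':
        # no trend -> decreasing: remember the price at the start of the run
        start_price = price
    elif trend == 'no trend' and prev == 'decreasing':
        # decreasing -> no trend: compare with the remembered price
        output[i] = price - start_price
    prev = trend

data['output'] = output
print(data)
```

A loop like this is easy to follow but slow on large frames, so a vectorized formulation is usually preferable once the logic is clear.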

2 Answers:

Answer 0: (score: 1)

We can use DataFrame.reindex after computing the differences:

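# True on each row where trend differs from the previous row;
# fillna makes the first row compare against itself, so it is never flagged
m = data['trend'].ne(data['trend'].shift()
                                  .fillna(data['trend']))
# Price difference between consecutive change points, re-aligned to the full
# index and kept only on rows where the trend is 'no trend'
data['output'] = (data.loc[m, 'price'].diff()
                      .reindex(data.index)
                      .where(data['trend'].eq('no trend')))
                      # alternative masks for the last step:
                      #.where(data['trend'].ne('decreasing'))
                      #.where(data['trend'].str.replace(' ', '').eq('notrend'))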

print(data)

# m plays the role of your condition_decreasing column, so this also works:

#data['output'] = (data.loc[data['condition_decreasing'], 'price']
#                      .diff()
#                      .reindex(data.index)
#                      .where(data['trend'].eq('no trend')))

Output

     price            trend  decreasing_increasing  output
0   7610.4         no trend                    0.0     NaN
1   7610.4         no trend                    0.0     NaN
2   7610.4         no trend                    0.0     NaN
3   7610.4       decreasing                 7610.4     NaN
4   7610.4       decreasing                 7610.4     NaN
5   7610.4       decreasing                 7610.4     NaN
6   7610.4       decreasing                 7610.4     NaN
7   7610.4       decreasing                 7610.4     NaN
8   7610.4       decreasing                 7610.4     NaN
9   7610.4       decreasing                 7610.4     NaN
10  7610.3         no trend                    0.0    -0.1
11  7610.3         no trend                    0.0     NaN
12  7613.9         no trend                    0.0     NaN
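For anyone who wants to run this end to end, the following sketch rebuilds the question's full 19-row sample and applies the same steps. It assumes that only the price and trend columns are needed, so the other columns are left out; the printed output above shows just the first 13 rows.

```python
import pandas as pd

# Sample rebuilt from the question; other columns omitted (assumption)
data = pd.DataFrame({
    'price': [7610.4] * 10 + [7610.3] * 2 + [7613.9] * 2
             + [7613.4, 7613.0] + [7612.0] * 3,
    'trend': ['no trend'] * 3 + ['decreasing'] * 7 + ['no trend'] * 5
             + ['decreasing'] * 3 + ['no trend'],
})

# Rows where the trend label changes (first row excluded via fillna)
m = data['trend'].ne(data['trend'].shift().fillna(data['trend']))

# Difference between prices at consecutive change points, aligned back to
# the full index and kept only where the trend is 'no trend'
data['output'] = (data.loc[m, 'price'].diff()
                      .reindex(data.index)
                      .where(data['trend'].eq('no trend')))

print(data[['price', 'trend', 'output']])
```

On this sample the change points fall on rows 3, 10, 15 and 18, so only rows 10 and 18 end up with a value; the difference computed at row 15 is masked because its trend is decreasing.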

Answer 1: (score: 0)

Maybe you want to do something like this?

import pandas as pd
import numpy as np

price = [7610.3, 7610.3, 7610.4, 7610.4, 7610.4, 7610.4, 7610.3, 7610.3, 7610.9]

df = pd.DataFrame({'price': price})
df['diff'] = df['price'].diff()
conditions = [
    (df['diff'] == 0),
    (df['diff'] > 0),
    (df['diff'] < 0)]
choices = ['no trend', 'increasing', 'decreasing']
df['trend'] = np.select(conditions, choices, default = None)
print(df)

    price  diff       trend
0  7610.3   NaN        None
1  7610.3   0.0    no trend
2  7610.4   0.1  increasing
3  7610.4   0.0    no trend
4  7610.4   0.0    no trend
5  7610.4   0.0    no trend
6  7610.3  -0.1  decreasing
7  7610.3   0.0    no trend
8  7610.9   0.6  increasing
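A possible variation, offered only as a sketch, is to map the sign of the difference to a label instead of listing the three conditions. The labels dictionary is a name introduced here for illustration, and the first row comes out as a missing value rather than None.

```python
import numpy as np
import pandas as pd

price = [7610.3, 7610.3, 7610.4, 7610.4, 7610.4, 7610.4, 7610.3, 7610.3, 7610.9]
df = pd.DataFrame({'price': price})

# np.sign gives -1.0, 0.0 or 1.0 (NaN for the first row, which has no diff)
labels = {0.0: 'no trend', 1.0: 'increasing', -1.0: 'decreasing'}
df['trend'] = np.sign(df['price'].diff()).map(labels)

print(df)
```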