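Find values in DataFrame rows - create a new column highlighting when the next row matches

Date: 2016-07-07 07:18:09

Tags: python pandas

I am trying to look up a value in the rows of a pandas DataFrame and create a new column that flags whether the next row matches as well. So for the following example: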

import pandas as pd

rng = pd.DataFrame({'test_1': ['A', 'A', 'A', 'A', 'B', 'B', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']}, index=pd.date_range('4/2/2014', periods=14, freq='BH'))
rng

The rows at 2014-04-02 13:00:00 and 2014-04-02 14:00:00 are both == B, so there is a match:

    test_1
2014-04-02 09:00:00 A
2014-04-02 10:00:00 A
2014-04-02 11:00:00 A
2014-04-02 12:00:00 A
2014-04-02 13:00:00 B
2014-04-02 14:00:00 B
2014-04-02 15:00:00 A
2014-04-02 16:00:00 A
2014-04-03 09:00:00 A
2014-04-03 10:00:00 A
2014-04-03 11:00:00 C
2014-04-03 12:00:00 A
2014-04-03 13:00:00 D
2014-04-03 14:00:00 D

So the new column should look like this:

    B_Matches
2014-04-02 09:00:00 0
2014-04-02 10:00:00 0
2014-04-02 11:00:00 0
2014-04-02 12:00:00 0
2014-04-02 13:00:00 0
2014-04-02 14:00:00 1
2014-04-02 15:00:00 0
2014-04-02 16:00:00 0
2014-04-03 09:00:00 0
2014-04-03 10:00:00 0
2014-04-03 11:00:00 0
2014-04-03 12:00:00 0
2014-04-03 13:00:00 0
2014-04-03 14:00:00 0

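I will do the same for C, D, etc. in other columns. I am essentially trying to find the times when a condition holds and the next period is the same, and I will then run count() on this column to see how often the next period matches. Please also show any alternative approaches.

Thanks for your help.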

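1 Answer:

Answer 0: (score: 2)

You can define a function that takes your value and returns, for each row, whether that row and the row before it both equal the value. This works for any value you pass. The boolean Series is then cast to int so that True and False are converted to 1 and 0 respectively: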

In [220]:
def func(val):
    # 1 where this row and the previous row both equal val, otherwise 0
    return ((rng['test_1'] == val) & (rng['test_1'].shift() == val)).astype(int)

func('B')

Out[220]:
2014-04-02 09:00:00    0
2014-04-02 10:00:00    0
2014-04-02 11:00:00    0
2014-04-02 12:00:00    0
2014-04-02 13:00:00    0
2014-04-02 14:00:00    1
2014-04-02 15:00:00    0
2014-04-02 16:00:00    0
2014-04-03 09:00:00    0
2014-04-03 10:00:00    0
2014-04-03 11:00:00    0
2014-04-03 12:00:00    0
2014-04-03 13:00:00    0
2014-04-03 14:00:00    0
Freq: BH, Name: test_1, dtype: int32
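
To see why this works, it may help to look at the intermediate steps on a small example. The snippet below is only a sketch: the four-element Series `s` is hypothetical toy data, not the DataFrame from the question. `shift()` moves every value down by one position, so comparing the shifted Series tells you what the previous row contained.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the steps
s = pd.Series(['A', 'B', 'B', 'A'])

is_b = s == 'B'                # current row equals 'B'
prev_is_b = s.shift() == 'B'   # previous row equals 'B' (first row compares NaN, so False)

# Both conditions must hold; the boolean result is then cast to 0/1
both = (is_b & prev_is_b).astype(int)
print(both.tolist())
```

Only the second of the two consecutive 'B' rows is flagged, which matches the B_Matches column requested in the question.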

In [222]:
func('A')

Out[222]:
2014-04-02 09:00:00    0
2014-04-02 10:00:00    1
2014-04-02 11:00:00    1
2014-04-02 12:00:00    1
2014-04-02 13:00:00    0
2014-04-02 14:00:00    0
2014-04-02 15:00:00    0
2014-04-02 16:00:00    1
2014-04-03 09:00:00    1
2014-04-03 10:00:00    1
2014-04-03 11:00:00    1
2014-04-03 12:00:00    1
2014-04-03 13:00:00    1
2014-04-03 14:00:00    1
Freq: BH, Name: test_1, dtype: int32
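
To store the results as new columns and count the matches, something along these lines should work. This is a sketch under a few assumptions: the column names `A_Matches` and `B_Matches` follow the naming in the question, and the datetime index is left out because it does not affect the comparison. Note that `count()` counts non-null entries, so on a 0/1 column it would return the number of rows; `sum()` gives the number of matches.

```python
import pandas as pd

# Same values as the constructor in the question; the datetime index is
# omitted here because it plays no part in the row-to-row comparison.
rng = pd.DataFrame({'test_1': ['A', 'A', 'A', 'A', 'B', 'B', 'A',
                               'A', 'A', 'A', 'A', 'A', 'A', 'A']})

def func(val):
    col = rng['test_1']
    # 1 where this row and the previous row both equal val, otherwise 0
    return ((col == val) & (col.shift() == val)).astype(int)

# One new column per value of interest
for val in ['A', 'B']:
    rng[val + '_Matches'] = func(val)

# sum() adds up the 1s, i.e. how often the previous period matched
counts = rng[['A_Matches', 'B_Matches']].sum()
print(counts)
```

The same loop can be extended to 'C', 'D', and so on by adding them to the list.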