How to slice a row and its previous row based on two column values in pandas?

Time: 2019-08-26 22:44:29

Tags: python pandas

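I used the difference between consecutive values in the month column, computed within each propertyId, to create a difference column.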

data_2019['difference'] = data_2019.groupby('propertyId')['month'].diff()

current state
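For reference, here is a minimal sketch of what that `diff()` call produces. The sample rows are an assumption: they are reconstructed from the output table in the first answer below, not taken from the original dataset.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output table in the first answer
data_2019 = pd.DataFrame({
    "propertyId": ["a111", "a111", "a111", "a111", "a111", "b111", "b111", "c116"],
    "month": [3, 5, 6, 7, 10, 2, 3, 2],
    "occ": [80.0, 93.0, 94.0, 95.5, 88.0, 97.0, 99.0, 97.0],
})

# Month gap to the previous row within the same propertyId.
# The first row of each group has no predecessor, so it gets NaN.
data_2019["difference"] = data_2019.groupby("propertyId")["month"].diff()
print(data_2019)
```

Because the difference is computed per group, a value of 1 can only appear when the previous row belongs to the same propertyId.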

Now I want to do the following:

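For every row that has 1 in the difference column, keep that row and the previous row, as long as the propertyId value is the same as in the previous row.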

desired state

2 answers:

Answer 0: (score: 0)

Here is one way you can do this:

# True for the second row of two consecutive rows
data_2019['difference+'] = data_2019.groupby('propertyId')['month'].diff()==1

# True for the first row of two consecutive rows
data_2019['difference-'] = data_2019.groupby('propertyId')['month'].diff(periods=-1)==-1

# 'keep' is True if a row is the first or the second or both
data_2019['keep'] = data_2019['difference+'] | data_2019['difference-']


Out:

    propertyId  month   occ     difference+ difference- keep
0   a111        3       80.0    False       False       False
1   a111        5       93.0    False       True        True
2   a111        6       94.0    True        True        True
3   a111        7       95.5    True        False       True
4   a111        10      88.0    False       False       False
5   b111        2       97.0    False       True        True
6   b111        3       99.0    True        False       True
7   c116        2       97.0    False       False       False

Then you can keep only the rows where data_2019['keep']==True:

data_2019 = data_2019[data_2019['keep']==True]
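The same idea can be sketched end to end without the helper columns. This is only an illustration: the function name `keep_consecutive_months` and the sample rows (reconstructed from the output table above) are assumptions, not part of the original question.

```python
import pandas as pd

def keep_consecutive_months(df, key="propertyId", col="month"):
    """Keep rows that belong to a pair of consecutive months within one group."""
    grouped = df.groupby(key)[col]
    # True for the later row of a consecutive pair (gap to previous row is 1)
    is_second = grouped.diff().eq(1)
    # True for the earlier row of a consecutive pair (gap to next row is -1)
    is_first = grouped.diff(periods=-1).eq(-1)
    return df[is_first | is_second]

# Hypothetical sample data matching the output table above
data_2019 = pd.DataFrame({
    "propertyId": ["a111", "a111", "a111", "a111", "a111", "b111", "b111", "c116"],
    "month": [3, 5, 6, 7, 10, 2, 3, 2],
    "occ": [80.0, 93.0, 94.0, 95.5, 88.0, 97.0, 99.0, 97.0],
})

result = keep_consecutive_months(data_2019)
print(result)
```

On this sample the rows with index 1, 2, 3, 5 and 6 are kept, which matches the keep column in the table.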

Answer 1: (score: 0)

You can try the following. Let me know if it doesn't work.


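# Value of 'difference' and 'propertyId' on the next row
df['new_diff'] = df['difference'].shift(-1)
df['new_propertyid'] = df['propertyId'].shift(-1)

# Keep a row if its own difference is 1, or if the next row's difference is 1
# and that next row has the same propertyId. Each comparison needs its own
# parentheses because & binds more tightly than ==.
mask = (df['difference']==1) | ((df['new_diff']==1) & (df['new_propertyid']==df['propertyId']))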

ans = df[mask]
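A self-contained sketch of this shift-based approach follows. The sample rows are an assumption (reconstructed from the output table in the first answer), and the column name `propertyId` follows the spelling used in the question. Each comparison is wrapped in its own parentheses, since `&` has higher precedence than `==` in Python.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the first answer's output table
df = pd.DataFrame({
    "propertyId": ["a111", "a111", "a111", "a111", "a111", "b111", "b111", "c116"],
    "month": [3, 5, 6, 7, 10, 2, 3, 2],
    "occ": [80.0, 93.0, 94.0, 95.5, 88.0, 97.0, 99.0, 97.0],
})
df["difference"] = df.groupby("propertyId")["month"].diff()

# Look one row ahead for both the difference and the group key
next_diff = df["difference"].shift(-1)
next_id = df["propertyId"].shift(-1)

# A row is kept if it closes a consecutive pair (its own difference is 1)
# or opens one (the next row has difference 1 and the same propertyId)
mask = (df["difference"] == 1) | ((next_diff == 1) & (next_id == df["propertyId"]))
ans = df[mask]
print(ans)
```

On this sample, `ans` contains the rows with index 1, 2, 3, 5 and 6, the same rows the first answer keeps.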