Question

Let's assume a dataframe indexed by datetime, with a column named "score" that is initially set to 10:

            score
2016-01-01  10
2016-01-02  10
2016-01-03  10
2016-01-04  10
2016-01-05  10
2016-01-06  10
2016-01-07  10
2016-01-08  10

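I want to subtract a fixed value (say, 1) from the score, but only where the index falls between certain dates (for example, between the 3rd and the 6th):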

            score
2016-01-01  10
2016-01-02  10
2016-01-03  9
2016-01-04  9
2016-01-05  9
2016-01-06  9
2016-01-07  10
2016-01-08  10

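Since my real dataframe is large, and I will be doing this for several date ranges with a different fixed value N for each range, I would like to achieve this without creating a new column set to -N for each case.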

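Something like numpy's where function, but restricted to a range: if the condition is met, add to or subtract from the current value, otherwise do nothing. Does something like that exist?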

Answer 1

Use index slicing:

df.loc['2016-01-03':'2016-01-06', 'score'] -= 1
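A minimal sketch of the slicing approach on the frame from the question, extended to several ranges. The `adjustments` list and its second entry are made up for illustration. Note that label slices on a sorted DatetimeIndex include both endpoints.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {"score": [10] * 8},
    index=pd.date_range("2016-01-01", periods=8, freq="D"),
)

# Hypothetical (start, end, N) triples; each range gets its own N.
adjustments = [
    ("2016-01-03", "2016-01-06", 1),
    ("2016-01-07", "2016-01-08", 2),
]

for start, end, n in adjustments:
    # Both endpoints are included in the slice; no helper column is created.
    df.loc[start:end, "score"] -= n

print(df["score"].tolist())  # [10, 10, 9, 9, 9, 9, 8, 8]
```

Each pass only touches the rows inside its range, so this should scale to many ranges on a large frame without extra columns.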

Answer 2

I would do something like this using query:

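import numpy as np
import pandas as pd

# pd.np is deprecated and gone from newer pandas releases; use numpy directly.
df = pd.DataFrame({"score": np.random.randint(1, 10, 100)},
    index=pd.date_range(start="2018-01-01", periods=100))

start = "2018-01-05"
stop = "2018-04-08"

# Returns the rows in [start, stop] with 1 subtracted; df itself is not modified.
df.query('@start <= index <= @stop') - 1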

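Edit: note that the boolean mask produced by eval can be used as well, but in a slightly different way, because pandas' where replaces the values where the condition is False.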

df.where(~df.eval('@start <= index <=  @stop '), 
         df['score'] - 1, axis=0, inplace=True)

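Note how the comparison is inverted (using ~) to get the desired result. This works, but it is not very readable. Of course, you could also use numpy's np.where, which does the job just as well.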
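A rough sketch of the np.where variant on the frame from the question, in case it helps. The names `in_range`, `via_np` and `via_mask` are made up here. `Series.mask` is shown next to it because it replaces values where the condition is True, so the `~` inversion is not needed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"score": [10] * 8},
    index=pd.date_range("2016-01-01", periods=8, freq="D"),
)

start = pd.Timestamp("2016-01-03")
stop = pd.Timestamp("2016-01-06")

# True for rows whose index lies in [start, stop], both ends included.
in_range = (df.index >= start) & (df.index <= stop)

score = df["score"]

# np.where picks score - 1 where the mask is True and score elsewhere.
via_np = np.where(in_range, score - 1, score)

# mask is the mirror image of where: it replaces where the condition is True.
via_mask = score.mask(in_range, score - 1)

df["score"] = via_mask
print(df["score"].tolist())  # [10, 10, 9, 9, 9, 9, 10, 10]
```

Both variants build a full-length result and assign it back, whereas the .loc slicing in Answer 1 only writes to the selected rows.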

Answer 3

Assuming the dates are of datetime dtype:

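# Note: both variants match the day of the month (3rd to 6th) in every month and year.

# if the date is its own column:
df.loc[df['date'].dt.day.between(3, 6), 'score'] = df['score'] - 1

# if the date is the index:
df.loc[df.index.day.isin(range(3, 7)), 'score'] = df['score'] - 1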
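One caveat worth illustrating: filtering on the day of the month selects those days in every month, which is only equivalent to a date range when the data covers a single month. The sketch below uses a made-up 40-day frame (not from the question) to show the difference; `by_day` and `by_date` are illustrative names.

```python
import pandas as pd

# Hypothetical frame spanning January and early February 2016.
df = pd.DataFrame(
    {"score": [10] * 40},
    index=pd.date_range("2016-01-01", periods=40, freq="D"),
)

# Day-of-month filter: matches the 3rd to 6th of January AND of February.
by_day = df.index.day.isin(range(3, 7))

# Full-date filter: matches only 2016-01-03 through 2016-01-06.
by_date = (df.index >= pd.Timestamp("2016-01-03")) & (
    df.index <= pd.Timestamp("2016-01-06")
)

print(int(by_day.sum()), int(by_date.sum()))  # 8 4

# Subtract 1 only inside the actual date range.
df.loc[by_date, "score"] -= 1
```

If the intent is a specific date range rather than "these days of any month", the full-date mask (or the slice from Answer 1) is probably the safer choice.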

Python Pandas: add a constant value to a column if the date is between 2 dates
