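I am trying to develop an algorithm that finds rows based on two conditions and assigns a value to those rows.
Basically, I want to find rows where a column has one value and, based on a datetime column, the row is more than a specified timedelta away from a row with another value.
In the very simple example below, find the rows where the value is Dog and that are more than 1 minute away from Cat, and assign them the new value Bear.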
DateTime Value New Column
2015-10-25 00:00 Dog Bear
2015-10-25 00:01 Dog Dog
2015-10-25 00:02 Cat Cat
2015-10-25 00:03 Dog Bear
I can do the first part easily, but I am completely lost on how to handle the time delta aspect.
import datetime
timefilter = datetime.timedelta(minutes=1)
df.loc[df['Value'] == 'Dog', 'New Column'] = 'Bear'  # set only 'New Column', not the whole row
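Answer 0 (score: 0)
@user2956554 - I think the key component you are looking for is pandas' .shift() function. It will help you get the difference in minutes between consecutive rows. Here is some quick code I put together: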
import pandas as pd
df = pd.DataFrame({'time':['1/3/2015 00:00','1/3/2015 00:01','1/3/2015 00:02','1/3/2015 00:03'], 'value':['Dog','Dog','Cat','Dog']})
df['time'] = pd.to_datetime(df['time'])
df['delta'] = (df['time'] - df['time'].shift(1)).dt.total_seconds() / 60  # gap to the previous row, in minutes
df['new_column'] = df['value']
df['new_column'][(df['delta'] >= 1) & (df['value'] == 'Dog') & (df['value'].shift(1) == 'Cat')] = 'Bear'
df.drop('delta', axis=1, inplace=True) # if you want to get rid of the 'delta' column
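Note: pandas emits a warning about the indexing style used on the second-to-last line of my code, but the code works. The warning and the reason behind it are slightly above my pay grade, but if you want to learn more, see this link: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
Here is the output (including the warning):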
C:\Program Files\Sublime Text 3\pandas_time_test.py:9: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df['new_column'][(df['delta'] >= 1) & (df['value'] == 'Dog') & (df['value'].shift(1) == 'Cat')] = 'Bear'
time value new_column
0 2015-01-03 00:00:00 Dog Dog
1 2015-01-03 00:01:00 Dog Dog
2 2015-01-03 00:02:00 Cat Cat
3 2015-01-03 00:03:00 Dog Bear
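The SettingWithCopyWarning shown above is triggered by chained indexing (`df['new_column'][mask] = ...`). As far as I understand, newer pandas versions with Copy-on-Write enabled may not update the original frame at all with that pattern. Below is a minimal sketch of the same logic using a single `.loc` assignment. It assumes the same sample data as above; the names `delta` and `mask` are only illustrative. Like the original answer, it compares each row only with the row directly before it, not with the nearest Cat anywhere in the frame.

```python
import pandas as pd

df = pd.DataFrame({
    'time': ['1/3/2015 00:00', '1/3/2015 00:01', '1/3/2015 00:02', '1/3/2015 00:03'],
    'value': ['Dog', 'Dog', 'Cat', 'Dog'],
})
df['time'] = pd.to_datetime(df['time'])

# Gap to the previous row as a Timedelta (NaT for the first row).
delta = df['time'].diff()

# A Dog row whose previous row is a Cat at least one minute earlier.
mask = (
    (delta >= pd.Timedelta(minutes=1))
    & (df['value'] == 'Dog')
    & (df['value'].shift(1) == 'Cat')
)

df['new_column'] = df['value']
df.loc[mask, 'new_column'] = 'Bear'  # one indexing call, so no chained assignment
print(df)
```

Keeping `delta` as a separate Series instead of a column also means there is nothing to drop afterwards.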