函数将不适用于pandas数据框,导致语法错误

时间:2015-11-20 18:21:51

标签: python pandas dataframe

我正在尝试将此功能应用于pandas数据框,以便查看出租车接送或下降时间是否在我使用arrimin创建的范围内,到达下面的最大变量。

如果时间确实属于范围,我想保留行。如果它超出范围我想从数据帧中删除它。

Start.Time,End.Time等都是日期时间对象,因此时间功能应该可以正常工作。

def time_function(df, row):
    gametimestart = df['Start.Time'] 
    gametimeend = df['End.Time'] 
    arrivemin = gametimestart - datetime.timedelta(minutes=120) 
    arrivemax = gametimeend - datetime.timedelta(minutes = 30) 
    departmin = gametimeend - datetime.timedelta(minutes = 60) 
    departmax = gametimeend + datetime.timedelta(minutes = 90)
    for not i in ((df['pickup_datetime'] > arrivemin) & (df['pickupdatetime'] < arrivemax) &(df['dropoff_datetime'] > departmin) & (df['dropoffdatetime'] < departmax)):
        df = df.drop[df[i.index]]
    return


for index, row in yankdf:
    time_function(yankdf, row)

继续收到此语法错误:

 File "<ipython-input-25-bda6fb2db429>", line 17
    for not i in (((row['pickup_datetime'] > arrivemin) & (row['pickupdatetime'] < arrivemax)) | ((row['dropoff_datetime'] > departmin) & (row['dropoffdatetime'] < departmax)):
          ^
SyntaxError: invalid syntax

1 个答案:

答案 0 :(得分:1)

我认为你不需要这个功能。只需执行一个基本子集,df_filtered就应该是过滤后的数据帧。

gametimestart = df['Start.Time'] 
gametimeend = df['End.Time'] 
arrivemin = gametimestart - datetime.timedelta(minutes=120) 
arrivemax = gametimeend - datetime.timedelta(minutes = 30) 
departmin = gametimeend - datetime.timedelta(minutes = 60) 
departmax = gametimeend + datetime.timedelta(minutes = 90)
df_filtered = df[(df['pickup_datetime'] > arrivemin) &
                 (df['pickup_datetime'] < arrivemax) &
                 (df['dropoff_datetime'] > departmin) & 
                 (df['dropoffdatetime'] < departmax)]