Problem converting Unix timestamps with pandas

Date: 2016-04-04 14:50:50

Tags: python pandas lambda

I have a pandas DataFrame df that looks like this:

       _sent_time_stamp  distance  duration  duration_in_traffic   Orig_lat  
0            1456732800      1670       208                  343  51.441092

I want to convert the epoch time values (_sent_time_stamp) into two columns, one holding the date and the other the hour.

I defined two functions:

def date_convert(time):
    return time.date()

def hour_convert(time):
    return time.hour()

Then I apply these functions with a lambda to create the 2 new columns.

df['date'] = Goo_results.apply(lambda row: date_convert(pd.to_datetime(row['_sent_time_stamp'], unit='s')), axis=1)

df['hour'] = Goo_results.apply(lambda row: hour_convert(pd.to_datetime(row['_sent_time_stamp'], unit='s')), axis=1)

The date column works, but the hour column does not. I don't understand why!

TypeError: ("'int' object is not callable", u'occurred at index 0')

1 answer:

Answer 0 (score: 1)

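Remove the () after hour. On a pandas Timestamp, date is a method but hour is an attribute holding an int, so time.hour() tries to call an integer and raises the TypeError: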

def date_convert(time):
    return time.date()

def hour_convert(time):
    return time.hour #remove ()

df['date'] = df.apply(lambda row: date_convert(pd.to_datetime(row['_sent_time_stamp'], unit='s')), axis=1)
df['hour'] = df.apply(lambda row: hour_convert(pd.to_datetime(row['_sent_time_stamp'], unit='s')), axis=1)    
print(df)
   _sent_time_stamp  distance  duration  duration_in_traffic   Orig_lat  \
0        1456732800      1670       208                  343  51.441092   

         date  hour  
0  2016-02-29     8  
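
The difference between the two accessors can be checked on a single value. The following is a minimal sketch that reuses the timestamp from the question; the variable name ts is only for illustration.

```python
import pandas as pd

# Convert the epoch value from the question into a single pandas Timestamp.
ts = pd.to_datetime(1456732800, unit='s')

print(ts.date())   # date is a method, so it needs ()
print(ts.hour)     # hour is a plain int attribute, so no ()

try:
    ts.hour()      # calls the int itself, which is what the question's code did
except TypeError as exc:
    print(exc)     # 'int' object is not callable
```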

But it is better and faster to use dt.date and dt.hour:

dat = pd.to_datetime(df['_sent_time_stamp'], unit='s')
df['date'] = dat.dt.date
df['hour'] = dat.dt.hour
print(df)
   _sent_time_stamp  distance  duration  duration_in_traffic   Orig_lat  \
0        1456732800      1670       208                  343  51.441092   

         date  hour  
0  2016-02-29     8  
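
One detail that may matter for later processing: as far as I know, dt.date produces Python date objects (an object-dtype column), while dt.normalize() keeps the column as datetime64 with the time set to midnight. The sketch below illustrates this under stated assumptions: the second timestamp is a made-up value 12 hours after the one in the question, and the day column is a hypothetical extra, not part of the original answer.

```python
import pandas as pd

# Second value is hypothetical: the question's timestamp plus 12 hours.
df = pd.DataFrame({'_sent_time_stamp': [1456732800, 1456776000]})
dat = pd.to_datetime(df['_sent_time_stamp'], unit='s')

df['date'] = dat.dt.date         # Python date objects
df['day'] = dat.dt.normalize()   # midnight Timestamps, still datetime64
df['hour'] = dat.dt.hour         # integers

print(df.dtypes)
```

The values are interpreted as UTC, since unit='s' counts seconds from the Unix epoch.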

Timings:

In [20]: %timeit new(df1)
1000 loops, best of 3: 827 µs per loop

In [21]: %timeit lamb(df)
The slowest run took 4.40 times longer than the fastest. This could mean that an intermediate result is being cached 
1000 loops, best of 3: 1.13 ms per loop

Code:

df1 = df.copy()

def date_convert(time):
    return time.date()

def hour_convert(time):
    return time.hour


def lamb(df):    
    df['date'] = df.apply(lambda row: date_convert(pd.to_datetime(row['_sent_time_stamp'], unit='s')), axis=1)
    df['hour'] = df.apply(lambda row: hour_convert(pd.to_datetime(row['_sent_time_stamp'], unit='s')), axis=1)    
    return df

def new(df): 
    dat = pd.to_datetime(df['_sent_time_stamp'], unit='s')
    df['date'] = dat.dt.date
    df['hour'] = dat.dt.hour
    return df

print(lamb(df))
print(new(df1))
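
The timings above were taken on a one-row frame, where fixed overhead dominates. A rough sketch for repeating the comparison on more rows is shown below. The 2,000-row test frame and the function names with_apply and vectorized are assumptions made for illustration; both functions work on a copy so the input frame is not modified.

```python
import timeit
import pandas as pd

# Hypothetical test data: 2,000 timestamps one minute apart.
start = 1456732800
big = pd.DataFrame({'_sent_time_stamp': range(start, start + 2000 * 60, 60)})

def with_apply(df):
    # Row-by-row conversion, as in the lambda version.
    out = df.copy()
    out['date'] = out.apply(
        lambda row: pd.to_datetime(row['_sent_time_stamp'], unit='s').date(), axis=1)
    out['hour'] = out.apply(
        lambda row: pd.to_datetime(row['_sent_time_stamp'], unit='s').hour, axis=1)
    return out

def vectorized(df):
    # Convert the whole column once, then use the .dt accessor.
    out = df.copy()
    dat = pd.to_datetime(out['_sent_time_stamp'], unit='s')
    out['date'] = dat.dt.date
    out['hour'] = dat.dt.hour
    return out

a = with_apply(big)
b = vectorized(big)

# Time three runs of each approach; the numbers depend on the machine.
print('apply:     ', timeit.timeit(lambda: with_apply(big), number=3))
print('vectorized:', timeit.timeit(lambda: vectorized(big), number=3))
```

Both functions should give the same dates and hours; the gap between the two approaches is expected to widen as the row count grows.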