If the values in the Pandas DataFrame column timestampMs are of type unicode and we want to convert them to float, is there any difference between the following two approaches?
df['timestampMs'].map(lambda x: float(x)/1000)
and
df['timestampMs'].astype('float')/1000
Since both seem to give the same result, which one is the preferred approach?
Answer 0 (score: 2)
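Well... if you care about speed, the lambda approach is slightly faster for small datasets. For large datasets, use the .astype() approach (which I personally also find more readable):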
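# Python 2 code: unicode and time.clock() are not available in current
# Python 3 releases (use str and time.perf_counter() there).
import time
import timeit
import pandas as pd

num_elements = 100
# Build a one-column DataFrame of unicode strings that hold float values.
times = [unicode(time.clock()) for x in range(num_elements)]
df = pd.DataFrame(times)

def first_method():
    # Element-wise conversion through a Python lambda.
    df[0].map(lambda x: float(x)/1000)

def second_method():
    # Whole-column conversion with astype.
    df[0].astype('float')/1000

num_reps = 15000
print("First method time for {} reps: {}".format(num_reps, timeit.timeit(first_method, number=num_reps)))
print("Second method time for {} reps: {}".format(num_reps, timeit.timeit(second_method, number=num_reps)))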
With num_elements = 100, I get:
First method time for 15000 reps: 1.95685731342
Second method time for 15000 reps: 2.22381265566
With num_elements = 1000, I get:
First method time for 15000 reps: 12.0774245498
Second method time for 15000 reps: 6.77670391568
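For anyone who wants to repeat the comparison on Python 3, here is a minimal sketch of the same benchmark. It is not the code that produced the timings above: the helper names (make_frame, via_map, via_astype, compare) and the sample timestamp values are made up for illustration, str stands in for unicode, and the repetition count is lowered so it finishes quickly. Absolute numbers will depend on your machine and pandas version.

```python
import timeit

import pandas as pd


def make_frame(num_elements):
    # Hypothetical test data: millisecond timestamps stored as strings.
    values = [str(1500000000000 + i) for i in range(num_elements)]
    return pd.DataFrame({"timestampMs": values})


def via_map(df):
    # Element-wise conversion through a Python lambda.
    return df["timestampMs"].map(lambda x: float(x) / 1000)


def via_astype(df):
    # Whole-column conversion with astype, then one division.
    return df["timestampMs"].astype("float") / 1000


def compare(num_elements, num_reps=200):
    df = make_frame(num_elements)
    # Check that both approaches agree before timing them.
    assert (via_map(df) == via_astype(df)).all()
    t_map = timeit.timeit(lambda: via_map(df), number=num_reps)
    t_astype = timeit.timeit(lambda: via_astype(df), number=num_reps)
    return t_map, t_astype


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        t_map, t_astype = compare(n)
        print("n={}: map {:.4f}s, astype {:.4f}s".format(n, t_map, t_astype))
```

Running compare over a few sizes should show whether the crossover between the two approaches also appears in your environment.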