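How to correctly get a single cell in pandas: loc[index, column] vs get_value(index, column)

Time: 2016-11-10 09:30:53

Tags: python pandas dataframe

Which method is better (in terms of performance and reliability) for getting a single cell from a pandas DataFrame: get_value() or loc[]?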

1 answer:

Answer 0 (score: 4)

You can find the relevant information in the docs:

  

Getting values explicitly (equivalent to the deprecated df.get_value('a','A'))

# this is also equivalent to ``df1.at['a','A']``
In [55]: df1.loc['a', 'A']
Out[55]: 0.13200317033032932

However, calling get_value does not produce any warning.

But if you check the docs for Index.get_value:

  

Fast lookup of value from 1-dimensional ndarray. Only use this if you know what you're doing

So I think it is better to use iat, at, loc, or ix.
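As a minimal sketch of the recommended accessors, the snippet below reads the same cell three ways. The frame df1 and its values are made up for illustration and are not the data from the docs excerpt above. ix is left out because it was later deprecated and, to my knowledge, removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical data for illustration only (not the df1 from the docs excerpt).
df1 = pd.DataFrame({'A': [0.1, 0.2], 'B': [0.3, 0.4]}, index=['a', 'b'])

by_loc = df1.loc['a', 'A']   # label-based, general-purpose indexer
by_at = df1.at['a', 'A']     # label-based, scalar access only
by_iat = df1.iat[0, 0]       # position-based, scalar access only

print(by_loc, by_at, by_iat)
```

at and iat accept exactly one row and one column, which is why they can skip much of the work loc does to support slices, lists, and boolean masks.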

Timings:

import pandas as pd

# Small sample frame with the default integer index (0, 1, 2)
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print(df)

In [93]: %timeit (df.loc[0, 'A'])
The slowest run took 6.40 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 177 µs per loop

In [96]: %timeit (df.at[0, 'A'])
The slowest run took 17.01 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.61 µs per loop

In [94]: %timeit (df.get_value(0, 'A'))
The slowest run took 23.49 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.36 µs per loop
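The timings above rely on the IPython %timeit magic. A rough equivalent for a plain Python script, using the standard-library timeit module, might look like the sketch below. The names accessors and timings are introduced here for illustration, the loop counts are arbitrary, and the absolute numbers will differ by machine and pandas version. get_value is omitted because it was deprecated and, as far as I know, removed in pandas 1.0.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})

# Each callable reads the same cell (row 0, column 'A').
accessors = {
    'loc': lambda: df.loc[0, 'A'],
    'at': lambda: df.at[0, 'A'],
    'iat': lambda: df.iat[0, 0],
}

loops = 1000  # arbitrary loop count for this sketch
timings = {}
for name, fn in accessors.items():
    # Take the best of 3 repeats, then convert to microseconds per call.
    best = min(timeit.repeat(fn, number=loops, repeat=3))
    timings[name] = best / loops * 1e6
    print(f"{name}: {timings[name]:.2f} µs per loop")
```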