Question

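There are at least 6 ways to retrieve elements from a pandas DataFrame: .iloc, .loc, .iat, .at, .ix, and the [] operator used directly.

What are the differences between them? How do they handle missing labels or out-of-range positions? How do they differ from Series lookups?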

Answer 1

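In general, you can retrieve elements from a DataFrame with the same methods available on a Series (.loc, .iloc, .ix and the [] operator) plus the scalar accessors .iat and .at, using the syntax:

newobj = df.method_choosen[row,column]

In most cases the new object is a new DataFrame, in some cases a Series, and in others just a scalar value.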

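As a general rule, .iloc[] and .iat[] select by position, while .loc[] and .at[] select by label. The first argument selects rows (use : for all rows) and the second, optional argument selects columns.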
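The rule above can be sketched with the integer-indexed frame used in the table further down. This is a minimal illustration that assumes a recent pandas release, where .ix no longer exists, so only the four remaining accessors are shown.

```python
import numpy as np
import pandas as pd

# Same frame as in the answer's table: row labels 10..14, column labels 20..23.
di = pd.DataFrame(np.arange(100, 120).reshape(5, 4),
                  index=[10, 11, 12, 13, 14],
                  columns=[20, 21, 22, 23])

by_position = di.iloc[0, 1]   # row at position 0, column at position 1
by_label = di.loc[10, 21]     # row labelled 10, column labelled 21
scalar_pos = di.iat[0, 1]     # scalar-only access by position
scalar_lab = di.at[10, 21]    # scalar-only access by label

column_20 = di.loc[:, 20]     # ':' selects all rows; the result is a Series
row_13 = di.loc[13]           # second argument omitted; the result is a Series

print(by_position, by_label, scalar_pos, scalar_lab)
print(column_20.tolist(), row_13.tolist())
```

All four scalar lookups address the same cell, which is why they print the same value.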

But keep in mind:

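Unlike .ix, .iloc and .loc, which always select rows first, [] selects columns when given a single element or a list of elements (and it does not accept a second argument). When passed a slice, however, it selects rows!
Selecting from a DataFrame with a single element in [] is always label-based, unlike a Series, where an integer passed to a string-indexed Series is interpreted as a position-based lookup.
.iat and .at accept only a single row value and a single column value, and raise an error if a slice or a list is passed.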
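The caveats about [] and .iat can be checked with a short sketch. It assumes a recent pandas release; the helper raised() is a hypothetical convenience introduced here only to record which exception type, if any, a lookup produces.

```python
import numpy as np
import pandas as pd

di = pd.DataFrame(np.arange(100, 120).reshape(5, 4),
                  index=[10, 11, 12, 13, 14],
                  columns=[20, 21, 22, 23])

def raised(lookup):
    """Hypothetical helper: return the exception class name, or None if none was raised."""
    try:
        lookup()
    except Exception as exc:
        return type(exc).__name__
    return None

col = di[20]           # single element: a column, looked up by label
cols = di[[20, 22]]    # list: several columns, looked up by label
rows = di[0:1]         # slice: rows, looked up by position

missing_col = raised(lambda: di[3])        # 3 is not a column label
iat_slice = raised(lambda: di.iat[0:1, 2]) # .iat only takes integer scalars

print(type(col).__name__, cols.shape, rows.index.tolist())
print(missing_col, iat_slice)
```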

There are some other inconsistencies, which are shared with Series lookups:

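As with Series, when a slice is given to a position-based lookup the last element is excluded, but when it is given to a label-based lookup the last element is included.
The returned value can be (a) an empty DataFrame, with only the column headers and no rows, (b) a DataFrame with the requested data shown as NaN, or (c) one of the exceptions KeyError, IndexError, ValueError, TypeError.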
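A short sketch of the slice endpoints and of the possible outcomes, assuming a recent pandas release. The NaN-filled result in case (b) comes from .ix and from list lookups in the old version the answer was written against; DataFrame.reindex is used here as a stand-in that yields the same kind of NaN rows, and is not the answer's own method.

```python
import numpy as np
import pandas as pd

di = pd.DataFrame(np.arange(100, 120).reshape(5, 4),
                  index=[10, 11, 12, 13, 14],
                  columns=[20, 21, 22, 23])

pos_slice = di.iloc[0:1]     # position-based: end excluded, one row
lab_slice = di.loc[12:14]    # label-based: end included, rows 12, 13 and 14

empty = di.iloc[200:202]     # (a) out-of-range slice: column headers only, no rows
nan_rows = di.reindex([20, 22])  # (b) stand-in: missing row labels become NaN rows

# (c) out-of-range scalar lookups raise instead of returning anything
try:
    di.iloc[13]
    iloc_error = None
except IndexError as exc:
    iloc_error = type(exc).__name__

try:
    di.loc[0]
    loc_error = None
except KeyError as exc:
    loc_error = type(exc).__name__

print(len(pos_slice), lab_slice.index.tolist(), empty.shape)
print(iloc_error, loc_error)
```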

Here is the (almost) complete table of all the possibilities (using pandas 0.17.1, NumPy 1.10.4, Python 3.4.3). See also the equivalent question for Series.

Case 1: DataFrame with integer index

di = pd.DataFrame(np.arange(100,120).reshape(5,4),index=[10,11,12,13,14],columns=[20,21,22,23])
di
     20    21    22    23
10  100   101   102   103
11  104   105   106   107
12  108   109   110   111
13  112   113   114   115
14  116   117   118   119

** Single element **                                       ** Slice **                                                                             ** Tuple **
di[20]        -> COL -> LAB -> {10:100,11:104,..,14:116}  di[0:1]             -> ROWS -> POS -> {10:{20:100,..,23:103}}                          di[[20,22]]            -> COLS -> LAB -> {10:{20:100,22:102},..,14:{20:116,22:118}}
di[3]         -> COL -> LAB -> KeyError                   di[200:202]         -> ROWS -> POS -> df with column headers                           di[[20]]               -> COLS -> LAB -> {10:{20:100},..,14:{20:116}} (df)
di[0,1]                     -> TypeError                  di[12:14]           -> ROWS -> POS -> df with column headers                           di[[0]]                -> COLS -> LAB -> KeyError
di[10,21]                   -> TypeError                  di[:,2]             -> ROWS -> POS -> TypeError                                        di[[10,11],[20,22]]                   -> TypeError
---                                                       ---                                                                                    ---
di.ix[0]      -> ROW -> LAB -> KeyError                   di.ix[0:1]          -> ROWS -> LAB -> df with column headers                           di.ix[[20,22]]         -> ROWS -> LAB -> {20:{20:NaN,..,23:NaN},22:{20:NaN,..,23:NaN}}
di.ix[13]     -> ROW -> LAB -> {20:112,..,23:115}         di.ix[200:202]      -> ROWS -> LAB -> df with column headers                           di.ix[[11,14]]         -> ROWS -> LAB -> {11:{20:104,..,23:107},14:{20:116,..,23:119}}
di.ix[0,1]    -> R/C -> LAB -> KeyError                   di.ix[12:14]        -> ROWS -> LAB -> {12:{20:108,..,23:111},..,14:{20:116,..,23:119}} di.ix[[11]]            -> ROWS -> LAB -> {11:{20:104,..,23:107}}
di.ix[10,21]  -> R/C -> LAB -> 101                        di.ix[:,20]         -> COL  -> LAB -> {10:100,..,14:116}                               di.ix[[0]]             -> ROWS -> LAB -> {0:{20:NaN,..,23:NaN}}
---                                                       di.ix[:,20:22]      -> R/C  -> LAB -> {10:{20:100,22:102},..,14:{20:116,..,22:118}}    di.ix[[10,11],[20,22]] -> R/C  -> LAB -> {10:{20:100,22:102},11:{20:104,22:106}}
di.iloc[0]    -> ROW -> POS -> {20:100,..,23:103}         di.ix[10:14,20:22]  -> R/C  -> LAB -> {10:{20:100,22:102},..,14:{20:116,..,22:118}}    di.ix[[16],[26]]       -> R/C  -> LAB -> {16:{26:NaN}}
di.iloc[13]   -> ROW -> POS -> IndexError                 di.ix[10,:]         -> ROW  -> LAB -> {20:100,..,23:103}                               ---
di.iloc[0,1]  -> R/C -> POS -> 101                        ---                                                                                    di.iloc[[20,22]]       -> ROWS -> POS -> IndexError
---                                                       di.iloc[0:1]        -> ROWS -> POS -> {10:{20:100,..,23:103}}                          di.iloc[[11,14]]       -> ROWS -> POS -> IndexError
di.loc[0]     -> ROW -> LAB -> KeyError                   di.iloc[200:202]    -> ROWS -> POS -> df with column headers                           di.iloc[[11]]          -> ROWS -> POS -> IndexError
di.loc[13]    -> ROW -> LAB -> {20:112,..,23:115}         di.iloc[12:14]      -> ROWS -> POS -> df with column headers                           di.iloc[[0]]           -> ROWS -> POS -> {10:{20:100,..,23:103}}
di.loc[10,21] -> R/C -> LAB -> 101                        di.iloc[0:1,2:3]    -> R/C  -> POS -> {10:{22:102}}                                    di.iloc[[0],[1]]       -> R/C  -> POS -> {10:{21:101}}
---                                                       di.iloc[6:8,6:8]    -> R/C  -> POS -> Empty df                                         ---
di.iat[0]            -> POS -> TypeError                  ---                                                                                    di.loc[[20,22]]        -> ROWS -> LAB -> KeyError
di.iat[0,1]   -> R/C -> POS -> 101                        di.loc[0:1]         -> ROWS -> LAB -> df with column headers                           di.loc[[11,14]]        -> ROWS -> LAB -> {11:{20:104,..,23:107},14:{20:116,..,23:119}}
di.iat[0,15]  -> R/C -> POS -> IndexError                 di.loc[200:202]     -> ROWS -> LAB -> df with column headers                           di.loc[[11]]           -> ROWS -> LAB -> {11:{20:104,..,23:107}}
di.iat[15,0]  -> R/C -> POS -> IndexError                 di.loc[12:14]       -> ROWS -> LAB -> {12:{20:108,..,23:111},..,14:{20:116,..,23:119}} di.loc[[0]]            -> ROWS -> LAB -> KeyError
di.iat[15,15] -> R/C -> POS -> IndexError                 di.loc[10:11,20:21] -> R/C  -> LAB -> {10:{20:100,21:101},11:{20:104,21:105}}          di.loc[[10],[21]]      -> R/C  -> LAB -> {10:{21:101}}
---                                                       di.loc[16:18,26:28] -> R/C  -> LAB -> Empty df                                         ---
di.at['a']           -> LAB -> ValueError                 ---                                                                                    di.iat[[0,1],[2,3]]    -> R/C  -> POS -> ValueError
di.at[10,21]  -> R/C -> LAB -> 101                        di.iat[0:1,2:3]     -> R/C  -> POS -> ValueError                                       di.iat[[0],[2]]        -> R/C  -> POS -> ValueError
di.at[0,1]    -> R/C -> LAB -> KeyError                   di.iat[0:1,2]       -> R/C  -> POS -> ValueError                                       ---
di.at[0,21]   -> R/C -> LAB -> KeyError                   di.iat[:,2]         -> R/C  -> POS -> ValueError                                       di.at[[10,11],[22,23]] -> R/C  -> LAB -> ValueError
di.at[10,0]   -> R/C -> LAB -> KeyError                   di.iat[2,:]         -> R/C  -> POS -> ValueError                                       
                                                          ---
                                                          di.at[10:11,22:23]  -> R/C  -> LAB -> ValueError
                                                          di.at[0:21,22]      -> R/C  -> LAB -> ValueError
                                                          di.at[:,22]         -> R/C  -> LAB -> ValueError
                                                          di.at[12,:]         -> R/C  -> LAB -> ValueError

Case 2: DataFrame with string index
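As a rough illustration of the string-indexed case, here is a minimal sketch on a recent pandas release. The frame ds and its labels ("a" to "e" for rows, "A" to "D" for columns) are hypothetical names chosen for this example and are not taken from the answer's own table.

```python
import numpy as np
import pandas as pd

# Hypothetical string-labelled frame with the same values as the integer case.
ds = pd.DataFrame(np.arange(100, 120).reshape(5, 4),
                  index=list("abcde"),
                  columns=list("ABCD"))

by_label = ds.loc["a", "B"]     # row "a", column "B"
by_position = ds.iloc[0, 1]     # same cell, addressed by position

column_a = ds["A"]              # single element in []: a column, by label
rows_bracket = ds["a":"c"]      # label slice in []: rows, end label included
rows_loc = ds.loc["a":"c"]      # label slice in .loc: rows, end label included
rows_iloc = ds.iloc[0:2]        # position slice in .iloc: rows, end excluded

print(by_label, by_position)
print(rows_bracket.index.tolist(), rows_iloc.index.tolist())
```

With string labels there is no ambiguity between a label and a position, so the label-based and position-based accessors are easier to tell apart than in the integer-indexed case.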