What is the difference between a DataFrame attribute and a column?

Time: 2018-11-16 03:00:17

Tags: python pandas dataframe

In [66]: data
Out[66]: 
   col1 col2 label
0   1.0    a     c
1   2.0    b     d
2   3.0    c     e
3   0.0    d     f
4   4.0    e     0
5   5.0    f     0

In [67]: data.label
Out[67]: 
0      c
1      d
2    NaN
3      f
4    NaN
5    NaN
Name: col2, dtype: object

In [68]: data['label']
Out[68]: 
0    c
1    d
2    e
3    f
4    0
5    0
Name: label, dtype: object

Why do data.label and data['label'] show different results?

2 answers:

Answer 0: (score: 0)

The biggest difference I have noticed is in assignment.

import random
import pandas as pd

crimes = ["ASB", "Violence", "Theft", "Public Order", "Drugs"]
seasons = "SummerCrime|WinterCrime".split("|")
# One column of 300 random crime types per season
data = {season: [random.choice(crimes) for _ in range(300)] for season in seasons}
df = pd.DataFrame(data)
# Attribute assignment with a new name does not create a column
df.FallCrime = [random.choice(crimes) for _ in range(300)]

Gives: UserWarning: Pandas doesn't allow columns to be created via a new attribute name

The documentation also covers this: https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

It includes the following warnings, which may be relevant to your question:

- You can use this access only if the index element is a valid Python
  identifier, e.g. s.1 is not allowed.
- The attribute will not be available if it conflicts with an existing
  method name, e.g. s.min is not allowed.
- Similarly, the attribute will not be available if it conflicts with
  any of the following list: index, major_axis, minor_axis, items.

In any of these cases, standard indexing will still work, e.g. s['1'],
s['min'], and s['index'] will access the corresponding element or
column.
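
To make the quoted caveats concrete, here is a minimal sketch. The frame and its column names ("min" and "label name") are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical frame: one column name collides with the DataFrame.min method,
# the other is not a valid Python identifier.
df = pd.DataFrame({"min": [3, 1, 2], "label name": ["x", "y", "z"]})

method = df.min        # attribute access finds the bound method, not the column
column = df["min"]     # bracket indexing returns the column as a Series

print(callable(method))               # True
print(column.tolist())                # [3, 1, 2]
print("label name".isidentifier())    # False, so it cannot be written after a dot
print(df["label name"].tolist())      # ['x', 'y', 'z']
```

Bracket indexing works in both cases, which is why it is the safer default when column names are not under your control.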

They go on to say:

You can use attribute access to modify an existing element of a Series or column of a
DataFrame, but be careful; if you try to use attribute access to create a new column,
it creates a new attribute rather than a new column.
**In 0.21.0 and later, this will raise a UserWarning** 

(So you may have done this without realizing it.)
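
One way this could produce the output in the question is sketched below. This is a guess at the cause, not something the question confirms: it assumes data.label was assigned before a column named label existed, so a plain Python attribute was created and later shadowed the real column. The frame and the mapping are invented for the example.

```python
import warnings
import pandas as pd

df = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": ["a", "b", "c"]})

# No 'label' column exists yet, so this only sets a Python attribute
# on the object (pandas 0.21.0 and later emit a UserWarning here).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df.label = df["col2"].map({"a": "c", "b": "d"})

# Later, a real column with the same name is created.
df["label"] = ["c", "d", "e"]

attr_view = df.label      # the stale attribute set earlier
item_view = df["label"]   # the actual column

print(attr_view.name)       # col2
print(item_view.name)       # label
print(item_view.tolist())   # ['c', 'd', 'e']
```

The attribute wins because normal attribute lookup finds it on the object before pandas falls back to looking up a column of that name.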

Answer 1: (score: 0)

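The difference between the two comes down to assignment. With data.label you cannot create a new column by assigning to it.

data.label is attribute access, while data["label"] is item access, which is the form to use when assigning values.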

If the column name contains a space, as in df['label name'], attribute access is not possible at all: data.label name is a syntax error.
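
A small sketch of the assignment behaviour, using a made-up frame (the names extra and other are hypothetical). It shows that attribute assignment does update a column that already exists, but only bracket assignment creates a new one.

```python
import warnings
import pandas as pd

df = pd.DataFrame({"label": ["c", "d", "e"]})

# Existing column: attribute assignment updates the column itself.
df.label = ["x", "y", "z"]

# New name: attribute assignment only sets a Python attribute.
# The UserWarning is silenced here to keep the output clean.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.extra = [1, 2, 3]

# New name: bracket assignment creates a real column.
df["other"] = [4, 5, 6]

print(df["label"].tolist())   # ['x', 'y', 'z']
print(list(df.columns))       # ['label', 'other']
```

Note that extra never appears in df.columns, even though df.extra still returns the list.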

For more information, see this answer.