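I have a Python object called DNA. I want to create 100 instances of DNA. Each instance contains a pandas dataframe that is identical for all instances. To avoid duplication, I want to make this dataframe a static/class attribute.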
import pandas as pd

some_df = pd.DataFrame()

class DNA(object):
    df = some_variable  # Do I declare here?

    def __init__(self, df=pd.DataFrame(), name='1'):
        self.name = name
        self.instance_df = instance_df  # I want to avoid this
        DNA.some_df = df  # Does this duplicate the data for every instance?
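What is the correct way to do this?
I want to be able to change the dataframe that I use as the class variable, but once the class is loaded, it needs to reference the same value (i.e. the same memory) in all instances.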
Answer 0 (score: 3)
I have answered your questions in the comments:
import pandas as pd

some_df = pd.DataFrame()

class DNA(object):
    df = some_variable  # You assign here. I would use `some_df`

    def __init__(self, df=pd.DataFrame(), name='1'):
        self.name = name
        self.instance_df = instance_df  # Yes, avoid this
        DNA.some_df = df  # This does not duplicate: assignment **never copies in Python**. However, I advise against this
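So, assigning DNA.some_df = df inside __init__ does work. Because default arguments are evaluated only once, at function definition time, that df is always the same df unless you explicitly pass a new df to __init__. Still, that strikes me as poor design. Instead, you probably want something like this:
class DNA(object):
    def __init__(self, df=pd.DataFrame(), name='1'):
        self.name = name

# <some work to construct a dataframe>
df = final_processing_function()
DNA.df = df
Then, if you want to change it at any point, you can use:
DNA.df = new_df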
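As a minimal sketch of why class-level assignment shares rather than copies, the snippet below checks object identity with `is`. To keep it runnable without pandas, a plain dict stands in for the dataframe; that stand-in, its contents, and the names `shared`, `replacement`, and `strands` are assumptions made for illustration. The binding rules are the same for any Python object, including a DataFrame.

```python
class DNA(object):
    df = None  # class attribute: one slot on the class, not one per instance

    def __init__(self, name='1'):
        self.name = name


# Hypothetical stand-in for the shared dataframe.
shared = {'base': ['A', 'C', 'G', 'T']}
DNA.df = shared  # binds the name on the class; no copy is made

strands = [DNA(name=str(i)) for i in range(100)]

# Every instance resolves `df` to the very same object.
all_same = all(s.df is shared for s in strands)
print(all_same)

# Rebinding on the class is seen by instances that already exist.
replacement = {'base': ['A', 'C', 'G', 'U']}
DNA.df = replacement
all_updated = all(s.df is replacement for s in strands)
print(all_updated)
```

Both checks print True: the instances never hold their own `df`, so attribute lookup always falls through to whatever object the class currently references.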
Note that an attribute set on the class is visible through every instance:
In [5]: class A:
   ...:     pass
   ...:

In [6]: a1 = A()
In [7]: a2 = A()
In [8]: a3 = A()
In [9]: A.class_member = 42
In [10]: a1.class_member
Out[10]: 42
In [11]: a2.class_member
Out[11]: 42
In [12]: a3.class_member
Out[12]: 42
But be careful: when you assign to an instance, Python takes you at your word:
In [14]: a2.class_member = 'foo'  # this shadows the class variable with an instance variable in this instance...
In [15]: a1.class_member
Out[15]: 42
In [16]: a2.class_member  # really an instance variable now!
Out[16]: 'foo'
This is reflected when you examine the namespaces of the instances and of the class object itself.
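One way to look at those namespaces is the built-in `vars()`, which returns an object's `__dict__`. The sketch below replays the session above as a plain script; the variable `a2_namespace` and the final `del` step are additions for illustration, not part of the original answer.

```python
class A:
    pass


a1 = A()
a2 = A()
A.class_member = 42        # stored in the class namespace
a2.class_member = 'foo'    # stored in a2's own namespace, shadowing the class attribute

print(vars(a1))                    # {}
a2_namespace = dict(vars(a2))      # snapshot of a2's namespace before cleanup
print(a2_namespace)                # {'class_member': 'foo'}
print(vars(A)['class_member'])     # 42

# Deleting the instance attribute removes the shadow,
# so lookup on a2 falls back to the class again.
del a2.class_member
print(a2.class_member)             # 42
```

For the DNA case this means a shared dataframe should always be reassigned through the class (`DNA.df = new_df`) and never through an instance, otherwise that one instance quietly gets its own attribute.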