Question

For example:

import numpy as np
import datetime

class Test():

    def __init__(self,atti1,atti2):
        self.atti1 = atti1
        self.atti2 = atti2


l1 = [Test(i,i+1) for i in range(1000000)]

My solution is:

start_time = datetime.datetime.now()
l11 = np.array([v.atti1 for v in l1])
l12 = np.array([v.atti2 for v in l1])
print(datetime.datetime.now()-start_time)

It takes 0:00:00.234735 on my 2017 MacBook Pro.

Is there a more efficient way to do this in Python?

---edit1

It is not necessary to use numpy. Here is another solution:

l11 = []
l12 = []

start_time = datetime.datetime.now()
for v in l1:
    l11.append(v.atti1)
    l12.append(v.atti2)
print(datetime.datetime.now()-start_time)

It takes 0:00:00.225412.

---edit2

Here is a bad solution:

l11 = np.array([])
l12 = np.array([])
start_time = datetime.datetime.now()

for v in l1:
    l11 = np.append(l11,v.atti1)
    l12 = np.append(l12,v.atti2)
print(datetime.datetime.now()-start_time)

Answer 1

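There is no need for numpy here; a list comprehension is usually enough. That is, l11 = [v.atti1 for v in lst] is fine.

Conceptually, you have to iterate over all the objects and access the attribute on each one.

Some measurements for "why you should not over-engineer":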

# numpy array builder
np.array([v.atti1 for v in lst])
np.array([v.atti2 for v in lst])
215 ms ± 3.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

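This is the slowest, because you first build a list with the comprehension, and then numpy allocates memory for the array and copies the data into it.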
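If a numpy array really is the result you need, the temporary list can be avoided. The following is a minimal sketch, not a benchmarked recommendation: it assumes the attributes are integers (hence the `dtype=np.int64`, which is my assumption) and uses a small stand-in list. `np.fromiter` fills the array straight from a generator, and passing `count` lets it allocate the array once. The per-object attribute access still happens in Python, so measure before assuming this is faster than the plain comprehension.

```python
import numpy as np

class Test:
    def __init__(self, atti1, atti2):
        self.atti1 = atti1
        self.atti2 = atti2

lst = [Test(i, i + 1) for i in range(1000)]  # small stand-in for the real list

# Build each array directly from a generator: no temporary Python list.
# dtype=np.int64 is an assumption that matches the integer attributes here.
arr1 = np.fromiter((v.atti1 for v in lst), dtype=np.int64, count=len(lst))
arr2 = np.fromiter((v.atti2 for v in lst), dtype=np.int64, count=len(lst))

print(arr1[:3], arr2[:3])
```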

# single list iteration with appending
l1 = []
l2 = []
for v in lst:
    l1.append(v.atti1)
    l2.append(v.atti2)
174 ms ± 384 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Better, but you make a lot of function calls to .append, and the lists are reallocated and copied as they grow.
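To get a feel for how often that reallocation happens, here is a rough sketch that watches `sys.getsizeof` while appending. This relies on a CPython implementation detail (lists over-allocate, so the reported size only changes when the buffer is resized); the exact count will differ between versions, so no specific number is claimed.

```python
import sys

items = []
sizes = []
for i in range(1000):
    items.append(i)
    sizes.append(sys.getsizeof(items))

# Count how many times the reported size changed after the first append,
# i.e. how many times the underlying buffer was resized.
resizes = sum(1 for a, b in zip(sizes, sizes[1:]) if a != b)
print(f"{resizes} resizes for {len(items)} appends")
```

The resizes are far fewer than the appends, so the copying cost is amortized; the per-call overhead of .append is the more visible part of the difference.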

# the thing you should always start with, no premature optimizations
l1 = [v.atti1 for v in lst]
l2 = [v.atti2 for v in lst]
99.3 ms ± 982 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

This is more readable, Pythonic, does exactly what it says, and is faster. Internally it runs faster because of low-level optimizations for comprehensions.
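The numbers above come from my machine, so it is worth reproducing them on yours. Here is a small sketch using the standard `timeit` module; the function names and the reduced list size are my own choices for illustration, and the absolute timings will vary.

```python
import timeit

class Test:
    def __init__(self, atti1, atti2):
        self.atti1 = atti1
        self.atti2 = atti2

lst = [Test(i, i + 1) for i in range(50_000)]  # smaller than the question's list

def with_append():
    l1, l2 = [], []
    for v in lst:
        l1.append(v.atti1)
        l2.append(v.atti2)
    return l1, l2

def with_comprehension():
    return [v.atti1 for v in lst], [v.atti2 for v in lst]

# Both approaches must produce the same data before their speed is compared.
assert with_append() == tuple(with_comprehension())

for fn in (with_append, with_comprehension):
    # Best of 3 repeats, 5 calls each; absolute numbers depend on the machine.
    best = min(timeit.repeat(fn, number=5, repeat=3))
    print(f"{fn.__name__}: {best / 5 * 1000:.2f} ms per call")
```

Taking the minimum over several repeats is less sensitive to background noise than a single datetime.now() difference.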

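Note that CPython (which you are most likely using) stores object attributes in key-sharing dictionaries (PEP 412, available since 3.3), and since 3.6 this has been combined with the compact dict implementation. The two work well together; the improved memory efficiency helps raw performance considerably.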

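I am not sure whether the VM actually takes advantage of the shared keys when running the comprehension (probably not), but in 99% of cases this should be left to the VM to optimize. High-level languages such as Python are really not about micro-optimization.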

Answer 2

You can use self.__dict__ in Python to return a dictionary of an object's attributes and their values.

import numpy as np
import datetime
import pandas as pd
class Test():
    def __init__(self,atti1,atti2):
        self.atti1 = atti1
        self.atti2 = atti2

    def getAttr(self):
        return self.__dict__


l1 = [Test(i,i+1).getAttr() for i in range(1000000)]

l1 = pd.DataFrame(l1)

l11 = list(l1['atti1'])
l12 = list(l1['atti2'])
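If you would rather not add a getter method to the class, the built-in vars() returns the same dictionary. This is a small sketch on a three-element list (the names objs and rows are only for illustration) and it leaves pandas out. Keep in mind that the returned dict is the object's live attribute dictionary, not a copy, so mutating it changes the object.

```python
class Test:
    def __init__(self, atti1, atti2):
        self.atti1 = atti1
        self.atti2 = atti2

objs = [Test(i, i + 1) for i in range(3)]

# vars(obj) returns obj.__dict__, the mapping of attribute names to values.
rows = [vars(o) for o in objs]
print(rows[0])

# Column-wise extraction from the list of dicts.
l11 = [row["atti1"] for row in rows]
l12 = [row["atti2"] for row in rows]
print(l11, l12)
```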

The most efficient method to get an object's attributes when the object is in a list in python?
