Fast way to convert a PyMongo Cursor to an array / DataFrame

Time: 2018-01-21 12:41:44

Tags: python mongodb pandas numpy

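I am fetching 43k records from a MongoDB server, each with only 2 datetime fields. Converting the cursor to an array and then to a DataFrame takes about 10 to 20 seconds. I have tried several ways of doing this: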

Attempt 1

import pandas as pd
docs = pd.DataFrame(list(collection.find({"dId": 4}, {"ts": 1, "cd": 1, "_id": 0}).sort("ts", -1).limit(43200)))

Attempt 2

import numpy as np
docs = collection.find({"dId": 4}, {"ts": 1, "cd": 1, "_id": 0}).sort("ts", -1).limit(43200)
# Preallocate a 2 x 43200 array: row 0 holds "ts", row 1 holds "cd"
ts = np.empty([2, 43200], dtype='datetime64[us]')
for i, doc in enumerate(docs):
    ts[0][i] = doc["ts"]
    ts[1][i] = doc["cd"]
df = pd.DataFrame({'ts': ts[0], 'cd': ts[1]})
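
Attempt 2 preallocates exactly 43200 slots, so if the query returned fewer documents the remaining rows would stay uninitialized. A minimal sketch of a variant that collects the two fields column-wise in a single pass without a fixed size. `docs_to_columns` is a hypothetical helper (not a PyMongo or pandas API), the sample data only stands in for the real cursor, and this has not been timed against the attempts here:

```python
from datetime import datetime, timedelta


def docs_to_columns(docs, fields=("ts", "cd")):
    """Collect the named fields from an iterable of dicts into per-field lists.

    `docs` can be any iterable of dicts; a PyMongo cursor also yields dicts.
    """
    columns = {name: [] for name in fields}
    # Bind the append methods once to avoid repeated attribute lookups in the loop.
    appenders = [(name, columns[name].append) for name in fields]
    for doc in docs:
        for name, append in appenders:
            append(doc[name])
    return columns


# Stand-in data shaped like the projected documents (no MongoDB needed).
start = datetime(2018, 1, 21)
sample_docs = [
    {"ts": start + timedelta(minutes=i), "cd": start + timedelta(minutes=i, seconds=5)}
    for i in range(1000)
]
columns = docs_to_columns(sample_docs)
```

The resulting dict of lists can be passed directly to pd.DataFrame(columns), and its length always matches the number of documents actually returned.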

Attempt 3

docs = collection.find({"dId": 4}, {"ts": 1, "cd": 1, "_id": 0}).sort("ts", -1).limit(43200)
arr = [x for x in docs]

Attempt 4

cursor = collection.find({"dId": 4}, {"ts": 1, "cd": 1, "_id": 0}).sort("ts", -1).limit(43200)
my_data = [list(doc.values()) for doc in cursor]
result = np.array(my_data)

How can I reduce this time to under 2 seconds?
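
Before optimizing, it may help to measure where the 10 to 20 seconds actually go: draining the cursor (network transfer and BSON decoding) or building the array / DataFrame. A minimal sketch under those assumptions. `time_stages` is a hypothetical helper, and the generator below only stands in for a real PyMongo cursor:

```python
import time


def time_stages(docs, build=list):
    """Time draining `docs` separately from building the final structure.

    Returns (fetch_seconds, build_seconds, result).
    """
    t0 = time.perf_counter()
    # For a real cursor, this step includes network transfer and BSON decoding.
    rows = list(docs)
    t1 = time.perf_counter()
    # Pass e.g. pd.DataFrame as `build` to time the DataFrame construction.
    result = build(rows)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, result


# Stand-in generator instead of a real cursor (no MongoDB needed).
fake_cursor = ({"ts": i, "cd": i + 1} for i in range(43200))
fetch_s, build_s, result = time_stages(fake_cursor)
print(f"fetch: {fetch_s:.3f}s, build: {build_s:.3f}s, rows: {len(result)}")
```

If the first number dominates with the real cursor, the conversion code is probably not the bottleneck, and changing how the DataFrame is built is unlikely to help much.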

0 answers:

No answers