I'm trying pool() in Python:
import pandas as pd
import csv
from pymongo import MongoClient
import codecs
import re
from multiprocessing.dummy import Pool as ThreadPool
# read the csv file and load its rows into the database using pymongo
mongo_client = MongoClient('111.111.11.111', maxPoolSize=200)
db = mongo_client.mydb
def upload(reader):
    for each in reader:
        row = {}
        txt = re.split(" ", str(each))
        row["time"] = re.split("'", txt[0])[1]
        row["ticker"] = txt[1]
        row["price"] = re.split(r"\((.*?)\)", txt[2])[1]
        row["open"] = re.split(r"\((.*?)\)", txt[3])[1]
        db.price.insert_one(row)
pool = ThreadPool(10)
results = pool.map(upload, csv.reader(codecs.open('C:\\log.txt', 'rU', 'utf-16')))
The idea is to split the large log.txt file into chunks for a pool of 10 threads and process them in parallel to speed things up. But nothing shows up in the database, which means my code isn't working. What's wrong here? (I'm sure the problem isn't in the upload function, because it works fine if I run it without pool().)
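
For reference, this is roughly the sequential version that does work (a minimal sketch reusing the imports, db connection, and upload definition from the snippet above):

# Sequential version for comparison: iterate the whole reader in one thread.
# This inserts rows into db.price correctly, which is why I believe upload()
# and the MongoDB connection are not the problem.
reader = csv.reader(codecs.open('C:\\log.txt', 'rU', 'utf-16'))
upload(reader)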