Question

我从未使用任何类型的并行处理方法，这是非常新的。我想从SQL Server读取大量数据（即至少200万行），并希望使用并行处理来加快读取速度。下面是我尝试使用并发的未来进程池进行并行处理。

class DatabaseWorker(object):
    def __init__(self, connection_string, n, result_queue = []):
        self.connection_string = connection_string
        stmt = "select distinct top %s * from dbo.KrishAnalyticsAllCalls" %(n)
        self.query = stmt
        self.result_queue = result_queue

    def reading(self,x):
        return(x)

    def pooling(self):   
        t1 = time.time()
        con = pyodbc.connect(self.connection_string)
        curs = con.cursor()
        curs.execute(self.query)                  
        with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
            print("Test1")
            future_to_read = {executor.submit(self.reading, row): row for row in curs.fetchall()}
            print("Test2")
            for future in concurrent.futures.as_completed(future_to_read):
                print("Test3")
                read = future_to_read[future]
                try:
                    print("Test4")
                    self.result_queue.append(future.result())
                except:
                    print("Not working")
        print("\nTime take to grab this data is %s" %(time.time() - t1))   


df = DatabaseWorker(r'driver={SQL Server}; server=SPROD_RPT01; database=Reporting;', 2*10**7)
df.pooling()

我的当前实现没有得到任何输出。 "Test1"打印就是这样。没有其他事情发生。我理解并发未来文档提供的各种示例，但我无法在此处实现。我将非常感谢你的帮助。谢谢。

Python：使用并发未来流程池读取SQL Server数据

0 个答案: