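Python multiprocessing for loop

Time: 2020-10-24 16:53:59

Tags: python python-multiprocessing

My script does not work. All 10 processes take the first list item and then stop, so the output is the first list item 10 times. How can I fix this? The error must be in the loop, or do I need a queue for this?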

import finanzen_fundamentals.stocks as ff
import mysql.connector
import pandas as pd
import multiprocessing
import time

results = []

def get_list():
    try:
        mydb = mysql.connector.connect( host="localhost", user="changed", password="changed", database="stockdata")
        mycursor = mydb.cursor()
        mycursor.execute("select * from url_name")
        record = mycursor.fetchall()
        return record
    except Exception as e:
        return str(e)

def create_json(record):
    for row in record:
        try:
            df = ff.get_current_value_lxml(str(row[2])[:-1], exchange = "FSE")
            print('Name:' + row[0] + ' WKN:' + df['wkn'].values[0] + ' Preis:' + str(df['price'].values[0]) + ' Currency:' + df['currency'].values[0] + ' Zeit:' + df['time'].values[0])
            result = [[row[0], df['wkn'].values[0], df['price'].values[0], df['currency'].values[0], df['time'].values[0]]]
            return result
        except Exception as e:
            print(str(e))

def collect_results(result):
    results.extend(result)

if __name__ == '__main__':
    record = get_list()
    start_time = time.time()
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    for i in range(10):
        pool.apply_async(create_json, args=(record, ), callback=collect_results)
    pool.close()
    pool.join()

    df_out = pd.DataFrame(results, columns=['Name', 'WKN', 'Preis', 'Currency', 'Zeit'])
    print(df_out)

Output:

                      Name     WKN  Preis Currency        Zeit
0  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
1  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
2  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
3  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
4  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
5  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
6  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
7  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
8  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020
9  21VIANET GRP ADR A/6 O.  A1H9DT   20.0      EUR  23.10.2020

1 answer:

Answer 0 (score: 2)

You got the loop structure wrong. In create_json you loop over the rows of record, but you return on the first iteration, and the function is always called with the same original record list. As a result, every worker only ever processes the first row. You need to change the worker function so that it operates on a single row:

def create_json(row):
    try:
        df = ff.get_current_value_lxml(str(row[2])[:-1], exchange = "FSE")
        print('Name:' + row[0] + ' WKN:' + df['wkn'].values[0] + ' Preis:' + str(df['price'].values[0]) + ' Currency:' + df['currency'].values[0] + ' Zeit:' + df['time'].values[0])
        result = [[row[0], df['wkn'].values[0], df['price'].values[0], df['currency'].values[0], df['time'].values[0]]]
        return result
    except Exception as e:
        print(str(e))

Then, in the main code, call it once for each row:

if __name__ == '__main__':
    ...
    for row in record:
        pool.apply_async(create_json, args=(row, ), callback=collect_results)
    ...

Note that in this case you can simply use map instead of calling apply_async in a loop. It already returns the list of results, so you no longer need the callback.
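The map variant could look roughly like the sketch below. It is only an illustration under stated assumptions: the worker is a stand-in that parses made-up tuples instead of calling ff.get_current_value_lxml, the rows in record are invented, and flatten is a hypothetical helper. The helper is needed because each call returns a one-element list, or None after an exception, so the list returned by map has to be merged and filtered before it goes into a DataFrame.

```python
import multiprocessing


def create_json(row):
    """Stand-in worker (assumption): handles exactly ONE row.

    Returns a one-element list of records, like the worker above,
    or None if the row could not be processed.
    """
    try:
        name, wkn, price = row
        return [[name, wkn, float(price)]]
    except Exception as e:
        print(str(e))
        return None


def flatten(nested):
    """Hypothetical helper: drop failed rows (None) and merge the per-row lists."""
    return [item for part in nested if part is not None for item in part]


if __name__ == '__main__':
    # Made-up rows standing in for the database query result.
    record = [("A", "W1", "1.5"), ("B", "W2", "oops"), ("C", "W3", "3")]

    # map() calls create_json once per row and returns the results in input order.
    with multiprocessing.Pool(processes=2) as pool:
        nested = pool.map(create_json, record)

    results = flatten(nested)
    print(results)
```

Because map blocks until every row has been processed, no callback and no explicit close/join calls are needed here; the with block cleans up the pool afterwards.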