Returning a value from a Python thread's run() function

Time: 2019-09-27 14:17:53

Tags: python-3.x pandas multithreading dataframe

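I have pandas DataFrames that I want to apply a function to. I want to run many iterations, so I thought using multiple threads would be a good idea.

It looks like this: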

import threading

def my_function(data_inputs_train):
    # ... do something with the dataframe ...
    # ... group by, for loops, etc. ...
    # ... create a new dataframe ...
    return newPandasDataFrame

class myThread(threading.Thread):

    def __init__(self, threadID, data_inputs_train):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.data_inputs_train = data_inputs_train

    def run(self):
        # result_df is a local variable, so it is lost when run() returns
        result_df = my_function(self.data_inputs_train)

thread1 = myThread(1, data_inputs_train)
thread2 = myThread(2, data_inputs_train)

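So each thread should return a new DataFrame, and once both threads have finished I want to concatenate the two results they returned.

How can I do that? How do I return an object from the run() function, and how do I access it through my thread1 object?

Thanks!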

Update based on the first answer: it does not work, and there is also an indentation problem.

class myThread (threading.Thread):

   def __init__(self, threadID, name, sleep, cust_type, data_inputs_train):

      threading.Thread.__init__(self)

      self.threadID = threadID
      self.name = name
      self.sleep = sleep
      self.cust_type = cust_type
      self.data_inputs_train = data_inputs_train
      #here i need to get the newPandasDataFrame object.
      result_df = fdp.optimze_score_and_cl(data_inputs_train)

    def returnTheData(self):
        return result_df

1 Answer:

Answer 0 (score: 0)

Here is the basic skeleton of your program. I am only using example data to show how to set it up:

    import threading
    import pandas as pd

    def myFunction(x):
        # Example data only; replace this with the real processing of x
        df = pd.DataFrame(['1', '2'], columns=['A'])
        return df

    class myThreads(threading.Thread):
        def __init__(self, threadID, name, sleep, cust_type, data_inputs_train):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name
            self.sleep = sleep
            self.cust_type = cust_type
            self.data_inputs_train = data_inputs_train
            self.result_df = None  # set by run()

        def run(self):
            # Call the methods you need on your data. Doing the work in run()
            # rather than in __init__ makes it execute in the new thread.
            self.result_df = myFunction(self.data_inputs_train)

        def returnTheData(self):
            return self.result_df

    df = pd.DataFrame(['1'], columns=['A'])
    thread1 = myThreads(1, "EX1", 1, 'EX', df)
    thread2 = myThreads(2, "IN1", 2, 'IN', df)
    thread1.start()
    thread2.start()

    # Wait for both threads before reading their results
    thread1.join()
    thread2.join()

    df1 = thread1.returnTheData()
    df2 = thread2.returnTheData()

    print(df1)
    print(df2)

You declare the threads, start them, and let each one run the work it needs to do.

  

join()

This lets the main function wait until all threads have finished their processing.
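To make the role of join() concrete, here is a minimal sketch that uses only the standard library. The names ResultThread and slow_square are hypothetical and chosen purely for illustration, and plain integers stand in for DataFrames. The value computed in run() is stored on the thread object, and it is only read after join() has returned.

```python
import threading
import time


class ResultThread(threading.Thread):
    """Hypothetical helper: a thread that keeps what its function returned."""

    def __init__(self, func, *args):
        super().__init__()
        self.func = func
        self.args = args
        self.result = None  # filled in by run()

    def run(self):
        # run() executes in the new thread. Its own return value is
        # discarded, so the result is stored on the instance instead.
        self.result = self.func(*self.args)


def slow_square(x):
    time.sleep(0.1)  # stand-in for real work
    return x * x


threads = [ResultThread(slow_square, n) for n in (2, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # blocks until this thread's run() has returned

results = [t.result for t in threads]
print(results)  # [4, 9]
```

Reading t.result before join() could still yield None, because run() may not have finished yet.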

  

df2 = thread2.returnTheData()

You simply call a method that returns the data you need.
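The question also asks how to concatenate the two results. As an alternative to subclassing Thread (not the approach used in this answer), the standard library's concurrent.futures.ThreadPoolExecutor hands back return values directly, and pd.concat can then join them. This is only a sketch under assumptions: my_function here is a hypothetical stand-in that doubles a column, and the input frames are made-up example data.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def my_function(df):
    # Hypothetical stand-in for the real processing: returns a new DataFrame.
    out = df.copy()
    out["B"] = out["A"] * 2
    return out


# Made-up example inputs, one per worker thread
inputs = [pd.DataFrame({"A": [1, 2]}), pd.DataFrame({"A": [3]})]

with ThreadPoolExecutor(max_workers=2) as pool:
    # map() yields the results in the same order as the inputs
    frames = list(pool.map(my_function, inputs))

# Stack the per-thread results into a single DataFrame
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Leaving the with block waits for all submitted work to finish, so no explicit join() call is needed in this variant.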

Working code:

  

https://repl.it/repls/ClearHugeOutlier