I have a method defined in a class:
class X:
    def __init__(self, logger, tableDataLoader, dataCleanser, timeSeriesFunctions):
        self.logger = logger
        self.tableDataLoader = tableDataLoader
        self.dataCleanser = dataCleanser
        self.timeSeriesFunctions = timeSeriesFunctions
    def preProcess(self, inputLocForTrain, inputLocForTest, outputLoc, region, gl):
        # Do Something
I am trying to call this preProcess method through a multiprocessing class defined like this:
from multiprocessing import Process

class ProcessManager:
    def __init__(self, spark, logger):
        self.spark = spark
        self.logger = logger
    def applyMultiProcessExecution(self, func_arguments, targetFunction, iterableList):
        self.logger.info("Function Arguments : {}".format(func_arguments))
        jobs = []
        for x in iterableList:
            try:
                p = Process(target=targetFunction, args=(x,), kwargs=func_arguments)
                jobs.append(p)
                p.start()
            except:
                raise RuntimeError("Unable to create process for GL : {}".format(x))
        for job in jobs:
            job.join()
Now I invoke my ProcessManager like this:
processManager = ProcessManager(spark=spark, logger=logger)
dataFetcherFactory = DataFetcherFactory(logger)
dataFetcher = dataFetcherFactory.getDataFetcher(pipelineType=pipelineType)
dataCleanser = DataCleanser(logger)
timeSeriesFunctions = TimeSeriesFunctions(logger)
tableDataLoader = TableDataLoader(logger=logger, dataFetcher=dataFetcher, dataCleanser=dataCleanser,
                                  timeSeriesFunctions=timeSeriesFunctions)
preProcessDataForPCAModel = X(logger=logger,
                              tableDataLoader=tableDataLoader,
                              dataCleanser=dataCleanser,
                              timeSeriesFunctions=timeSeriesFunctions)
arguments = {FeatureConstants.INPUT_LOCATION_FOR_TRAIN: inputLocForTrain,
             FeatureConstants.INPUT_LOCATION_FOR_TEST: inputLocForTest,
             FeatureConstants.OUTPUT_LOCATION: outputLoc,
             REGION: region}
processManager.applyMultiProcessExecution(func_arguments=arguments,
                                          targetFunction=preProcessDataForPCAModel.preProcess,
                                          iterableList=[504])
This gives me the following error:
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
TypeError: preProcess() got multiple values for keyword argument 'inputLocForTrain'
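I went through several Stack Overflow posts, where people suggested this happens because self is passed as part of the class. I can't see how to solve my problem, because I need the constructor arguments to be available on self for the computation.
Can someone tell me how to fix this?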
Answer 0 (score: 1)
Try changing:
def preProcess(self, inputLocForTrain, inputLocForTest, outputLoc, region, gl):
to:
def preProcess(self, gl, inputLocForTrain, inputLocForTest, outputLoc, region):
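Why the reordering helps: as the traceback shows, Process.run() calls self._target(*self._args, **self._kwargs). With args=(x,), the value x is bound positionally to the first parameter after self. In the original signature that parameter is inputLocForTrain, which is also supplied through kwargs, hence "multiple values". The sketch below reproduces this with plain function calls instead of real processes. The class names Before and After, the helper call_like_process, and the dictionary values are hypothetical, and the keys are assumed to match the parameter names (the error message suggests that FeatureConstants.INPUT_LOCATION_FOR_TRAIN resolves to 'inputLocForTrain').

```python
class Before(object):
    # Original parameter order: gl comes last.
    def preProcess(self, inputLocForTrain, inputLocForTest, outputLoc, region, gl):
        return gl, inputLocForTrain


class After(object):
    # Reordered as suggested: gl is the first parameter after self.
    def preProcess(self, gl, inputLocForTrain, inputLocForTest, outputLoc, region):
        return gl, inputLocForTrain


def call_like_process(target, args, kwargs):
    # Mirrors the call made in multiprocessing.Process.run() (see the traceback):
    # self._target(*self._args, **self._kwargs)
    return target(*args, **kwargs)


# Hypothetical placeholder values; keys assumed to equal the parameter names.
func_arguments = {
    "inputLocForTrain": "train/",
    "inputLocForTest": "test/",
    "outputLoc": "out/",
    "region": "NA",
}

# Original order: 504 binds positionally to inputLocForTrain, which is also
# present in kwargs, so Python raises TypeError.
try:
    call_like_process(Before().preProcess, (504,), func_arguments)
    failed = False
    error_message = ""
except TypeError as exc:
    failed = True
    error_message = str(exc)

# Reordered signature: 504 binds to gl, everything else arrives by keyword.
fixed_result = call_like_process(After().preProcess, (504,), func_arguments)

# Alternative that keeps the original signature: pass gl by keyword as well.
kwargs_with_gl = dict(func_arguments, gl=504)
keyword_result = call_like_process(Before().preProcess, (), kwargs_with_gl)
```

If changing the signature is not an option, the keyword variant would correspond to building the process as Process(target=targetFunction, kwargs=dict(func_arguments, gl=x)) inside applyMultiProcessExecution, assuming the parameter is always named gl. Note that self is not involved in the collision: a bound method such as preProcessDataForPCAModel.preProcess already carries its instance.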