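I have a random forest model built in PySpark with the ML API. I want to access the individual trees in the forest, run a prediction with each tree, and compute the variance of the predictions across trees. How can I parallelize this prediction?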
from pyspark.ml.classification import RandomForestClassifier
forest = RandomForestClassifier(numTrees=5)
forest = forest.fit(cars_train)  # fit() returns the fitted RandomForestClassificationModel
forest_trees = forest.trees
forest_trees[0].transform(cars_train)  # predictions from the first tree only
forest_trees
[DecisionTreeClassificationModel (uid=dtc_aa66702a4ce9) of depth 5 with 17 nodes,
DecisionTreeClassificationModel (uid=dtc_99f7efedafe9) of depth 5 with 31 nodes,
DecisionTreeClassificationModel (uid=dtc_9306e4a5fa1d) of depth 5 with 21 nodes,
DecisionTreeClassificationModel (uid=dtc_d643bd48a8dd) of depth 5 with 23 nodes,
DecisionTreeClassificationModel (uid=dtc_a2d5abd67969) of depth 5 with 27 nodes]
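I can make predictions with each tree individually, but I want to do this in parallel and get the standard deviation across all of the per-tree predictions. How can I do that?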
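For reference, here is a minimal sketch of one possible approach, not a definitive answer. It chains each tree's `transform` onto the same DataFrame, so Spark can evaluate all trees in one lazy plan that runs across partitions in parallel once an action is called. It then computes a row-wise population standard deviation with a SQL expression. The helper names `stddev_expr` and `with_tree_stddev` and the output column names `tree_<i>_prediction` and `prediction_stddev` are hypothetical. The sketch assumes the trees use the default output columns (`prediction`, `rawPrediction`, `probability`) and that the input DataFrame does not already contain columns with those names.

```python
def stddev_expr(cols):
    """Build a Spark SQL expression for the row-wise population standard
    deviation across the given numeric columns."""
    n = len(cols)
    mean = "((" + " + ".join(cols) + f") / {n})"
    squares = " + ".join(f"pow({c} - {mean}, 2)" for c in cols)
    return f"sqrt(({squares}) / {n})"


def with_tree_stddev(model, df):
    """Append one prediction column per tree, plus their row-wise std. dev.

    `model` is a fitted random forest model exposing `.trees`;
    `df` is the DataFrame to score. Both names are placeholders.
    """
    out, cols = df, []
    for i, tree in enumerate(model.trees):
        col = f"tree_{i}_prediction"  # hypothetical column name
        # Assumption: each tree writes the default output columns.
        out = (
            tree.transform(out)
            .withColumnRenamed("prediction", col)
            .drop("rawPrediction", "probability")
        )
        cols.append(col)
    # Nothing is computed until an action (show, collect, write) runs.
    return out.selectExpr("*", f"{stddev_expr(cols)} AS prediction_stddev")
```

With the names from the question, this would be called as `with_tree_stddev(forest, cars_train).show()`. For a classifier, each tree's prediction is a class index, so the standard deviation is only a rough measure of disagreement between trees. Squaring `prediction_stddev` gives the variance.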