Question

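In mlflow, you can use the fluent projects API to create nested runs, which are collapsible in the UI. For example, with the following code (for UI support, see this):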

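import mlflow

# Outer (parent) run
with mlflow.start_run(nested=True):
  mlflow.log_param("mse", 0.10)
  mlflow.log_param("lr", 0.05)
  mlflow.log_param("batch_size", 512)
  # Inner (child) run, shown collapsed under the parent in the UI
  with mlflow.start_run(nested=True):
    mlflow.log_param("max_runs", 32)
    mlflow.log_param("epochs", 20)
    mlflow.log_metric("acc", 98)
    mlflow.log_metric("rmse", 98)
  # No explicit mlflow.end_run() needed: each with block ends its run on exit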

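Due to database connection issues, I want to use a single mlflow client throughout my application.

How can I nest runs, e.g. for hyperparameter optimization, when the runs are created via MlflowClient().create_run()?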

Answer 1

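It is a bit complicated to implement, but I found a way by looking at the Fluent Tracking Interface, which is what you get when you import mlflow directly.

In the start_run function, you can see that a nested run is defined simply by setting a specific tag, mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID. Set it to the parent run's run.info.run_id value and the run will show up correctly nested in the UI.

Here is an example: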

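from mlflow.exceptions import MlflowException
from mlflow.tracking import MlflowClient
from mlflow.utils.mlflow_tags import MLFLOW_PARENT_RUN_ID

client = MlflowClient()

# create_experiment returns the new experiment's ID. If the name is already
# taken it raises, so fall back to looking up the existing experiment.
try:
    experiment_id = client.create_experiment("test_nested")
except MlflowException:
    experiment_id = client.get_experiment_by_name("test_nested").experiment_id

parent_run = client.create_run(experiment_id=experiment_id)
client.log_param(parent_run.info.run_id, "who", "parent")

# Tagging a run with the parent's run ID is what makes it a nested run.
child_run_1 = client.create_run(
    experiment_id=experiment_id,
    tags={MLFLOW_PARENT_RUN_ID: parent_run.info.run_id},
)
client.log_param(child_run_1.info.run_id, "who", "child 1")

child_run_2 = client.create_run(
    experiment_id=experiment_id,
    tags={MLFLOW_PARENT_RUN_ID: parent_run.info.run_id},
)
client.log_param(child_run_2.info.run_id, "who", "child 2")

# Runs made with create_run stay in the RUNNING state until ended explicitly.
for run in (child_run_1, child_run_2, parent_run):
    client.set_terminated(run.info.run_id)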

In case you are wondering: you can also set the run name this way, using the mlflow.utils.mlflow_tags.MLFLOW_RUN_NAME tag.
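For the hyperparameter optimization case from the question, the two tags can be wrapped in a small helper. The following is only a sketch under some assumptions: nested_run_tags and log_trial are hypothetical names, not part of mlflow, and the tag keys are written out as the string values that MLFLOW_PARENT_RUN_ID and MLFLOW_RUN_NAME hold, so the helper only needs an object that behaves like MlflowClient.

```python
from typing import Any, Dict, Optional

# String values of mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID and
# MLFLOW_RUN_NAME, inlined here; importing the constants works equally well.
PARENT_RUN_ID_TAG = "mlflow.parentRunId"
RUN_NAME_TAG = "mlflow.runName"


def nested_run_tags(parent_run_id: str, run_name: Optional[str] = None) -> Dict[str, str]:
    """Build the tags that mark a run as a child of parent_run_id."""
    tags = {PARENT_RUN_ID_TAG: parent_run_id}
    if run_name is not None:
        tags[RUN_NAME_TAG] = run_name
    return tags


def log_trial(
    client: Any,
    experiment_id: str,
    parent_run_id: str,
    name: str,
    params: Dict[str, Any],
    metrics: Dict[str, float],
) -> str:
    """Create one child run, log its params and metrics, then end it.

    `client` is expected to be an MlflowClient (or anything with the same
    create_run / log_param / log_metric / set_terminated methods).
    """
    run = client.create_run(
        experiment_id=experiment_id,
        tags=nested_run_tags(parent_run_id, name),
    )
    run_id = run.info.run_id
    for key, value in params.items():
        client.log_param(run_id, key, value)
    for key, value in metrics.items():
        client.log_metric(run_id, key, value)
    # Runs made with create_run are not ended automatically.
    client.set_terminated(run_id)
    return run_id
```

With the client, experiment and parent_run from the example above, one trial could then be recorded as log_trial(client, experiment, parent_run.info.run_id, "trial-0", {"lr": 0.05}, {"acc": 0.98}), reusing the same single client for every trial.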
