Question

I am trying to get MLflow working with another machine on my local network, and I would like to ask for help because I don't know what to do next.

I have an mlflow server running on a server. It runs under my user account on that server and was started as follows:

mlflow server --host 0.0.0.0 --port 9999 --default-artifact-root sftp://<MYUSERNAME>@<SERVER>:<PATH/TO/DIRECTORY/WHICH/EXISTS>

The program that should log all its data to the mlflow server looks like this:

from mlflow import log_metric, log_param, log_artifact, set_tracking_uri

if __name__ == "__main__":
    remote_server_uri = '<SERVER>' # this value has been replaced
    set_tracking_uri(remote_server_uri)
    # Log a parameter (key-value pair)
    log_param("param1", 5)

    # Log a metric; metrics can be updated throughout the run
    log_metric("foo", 1)
    log_metric("foo", 2)
    log_metric("foo", 3)

    # Log an artifact (output file)
    with open("output.txt", "w") as f:
        f.write("Hello world!")
    log_artifact("output.txt")

The parameters and metrics get transferred to the server, but the artifacts do not. Why is that?

Note on the SFTP part: I can log in via SFTP, and the pysftp package is installed.

Answer 1

I think your problem is that you also need to create the experiment yourself so that it uses the remote sftp storage:

mlflow.create_experiment("my_experiment", artifact_location=sftp_uri)

This fixed it for me.
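To spell that out a bit: as far as I understand it, MLflow stores an experiment's artifact location when the experiment is created, so an experiment that already exists keeps whatever location it was given at that time. Below is a minimal sketch of creating the experiment once and then logging into it. The helper names (`build_sftp_uri`, `get_or_create_experiment`, `log_example_run`) are mine and not part of MLflow, and the `sftp://user@host/path` layout is an assumption you should adapt to your own setup.

```python
def build_sftp_uri(user, host, path):
    """Assemble an sftp:// artifact URI (hypothetical helper, not an MLflow API)."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /srv/mlflow/artifacts")
    return f"sftp://{user}@{host}{path}"


def get_or_create_experiment(name, artifact_location):
    """Return the experiment ID for `name`, creating the experiment if it is missing."""
    import mlflow  # imported lazily so build_sftp_uri works without mlflow installed

    existing = mlflow.get_experiment_by_name(name)
    if existing is not None:
        # An existing experiment keeps the artifact location it was created with.
        return existing.experiment_id
    return mlflow.create_experiment(name, artifact_location=artifact_location)


def log_example_run(tracking_uri, experiment_name, artifact_location):
    """Log one parameter, one metric and one artifact into the given experiment."""
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)
    experiment_id = get_or_create_experiment(experiment_name, artifact_location)
    with mlflow.start_run(experiment_id=experiment_id):
        mlflow.log_param("param1", 5)
        mlflow.log_metric("foo", 1)
        with open("output.txt", "w") as f:
            f.write("Hello world!")
        mlflow.log_artifact("output.txt")
        # Shows where the client is actually trying to upload artifacts.
        print(mlflow.get_artifact_uri())
```

You would call `log_example_run` with your tracking server's address as the first argument, the name "my_experiment" as the second, and the result of `build_sftp_uri(...)` as the third. If the printed artifact URI is not the sftp location you expect, the run is probably still going into an experiment that was created with a different artifact location.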

Answer 2

I don't know whether this really answers my question, but I did get it working this way.

On the server, I created the directory /var/mlruns. I pass this directory to mlflow via --backend-store-uri file:///var/mlruns

Then I mount that directory, e.g. via sshfs, on my local machine under the same path.

I don't like this solution, but so far it has solved the problem well enough.
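The workaround depends on the client being able to write to the same path the server uses, so it can help to check that before logging. Here is a rough sketch using only the standard library. The function names are hypothetical, and it assumes the artifact root is a plain path or a file:// URI such as the /var/mlruns directory above.

```python
import os
from urllib.parse import urlparse


def local_artifact_dir(artifact_uri):
    """Return the filesystem path for a plain or file:// URI, else None."""
    parsed = urlparse(artifact_uri)
    if parsed.scheme in ("", "file"):
        return parsed.path
    # Remote schemes such as sftp:// are not local directories.
    return None


def client_can_write(artifact_root):
    """True if artifact_root is an existing local directory this process can write to."""
    path = local_artifact_dir(artifact_root)
    return path is not None and os.path.isdir(path) and os.access(path, os.W_OK)
```

With the sshfs mount in place, `client_can_write("file:///var/mlruns")` should return True on the local machine. If it returns False, the mount is missing or not writable, and the artifact upload will most likely fail for the same reason.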

Artifact storage and MLflow on a remote server
