SageMaker PyTorch: cannot import sklearn from the training script

Time: 2019-06-02 01:16:59

Tags: python amazon-sagemaker

I am using SageMaker's PyTorch estimator:

from sagemaker.pytorch import PyTorch

estimator = PyTorch(entry_point='train.py',
                    role=role,
                    framework_version='1.0.0',
                    train_instance_count=1,
                    train_instance_type='ml.m4.xlarge',
                    source_dir='source', #the directory where the supporting files are

                    #what is passed in
                    hyperparameters={
                        'max_epochs' : 6,
                        'layer_dim'  : "2500,500,100,1",
                        'batch_size' : 64,
                        'seed'       : 4524,
                        'cuda'       : False
                        }
                   )

where the entry-point script train.py contains several imports:

import argparse
import math
import os
from shutil import copy
import time
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler

But the sklearn import fails:

  File "/opt/ml/code/train.py", line 9, in <module>
    from sklearn.preprocessing import StandardScaler
ModuleNotFoundError: No module named 'sklearn'

Questions:

  1. How can I use sklearn functions in this case?
  2. Is it possible to add extra pip installs without going the custom Docker route?

1 answer:

Answer 0: (score: 1)

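Yes, you can install dependencies without going the custom Dockerfile route (a.k.a. BYO container).

Use this in your code, before the import that needs the package (where mypackage stands for the pip package of your choice):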

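import subprocess as sb
import sys

mypackage = "scikit-learn"  # pip package name that provides sklearn

# Run before importing the package; check_call raises if pip fails
sb.check_call([sys.executable, "-m", "pip", "install", mypackage])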
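
Two details are easy to miss with this approach: the pip name can differ from the import name (sklearn is distributed on PyPI as scikit-learn), and the install has to run before the import is attempted. A minimal sketch of a helper that handles both might look like the following. The name ensure_package and its installer parameter are hypothetical, introduced here only for illustration; they are not part of the SageMaker SDK.

```python
import importlib
import subprocess
import sys


def _pip_install(package):
    # Install with the same interpreter that is running the training script.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_package(module_name, pip_name=None, installer=_pip_install):
    """Import module_name, installing it first if the import fails.

    pip_name is the distribution name when it differs from the import
    name. installer is a hypothetical hook so the install step can be
    swapped out (for example in tests).
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        pass
    installer(pip_name or module_name)
    # Let the import system see the newly installed files.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Intended use at the top of train.py (commented out so that running
# this sketch does not trigger a real install):
# ensure_package("sklearn", "scikit-learn")
# from sklearn.preprocessing import StandardScaler
```

The install happens at most once per process and is skipped when the package is already present in the container, so the same train.py should also run unchanged on a machine where scikit-learn is installed.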