Cannot load a joblib model in another virtual environment

Time: 2020-10-14 19:17:38

Tags: python amazon-s3 scikit-learn pickle joblib

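I created a machine learning model and dumped it to an S3 bucket. Now I want to use it in another virtual environment, but loading it with the following code raises a missing-module error: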

import boto3
import botocore
import os
import joblib

BUCKET_NAME = 'ml-models'
KEY = 'model_2.4.joblib'


def download_s3_model():
    """
    Downloads a pickled model from S3 and loads it with joblib.
    :return: unpickled model
    """
    # Make s3 connection
    s3 = boto3.client('s3')

    # Create directory if not exist
    if not os.path.exists('s3_models'):
        os.makedirs('s3_models')

    # Try to download the S3 file
    try:
        s3.download_file(BUCKET_NAME, KEY, f's3_models/local_{KEY}')
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
        else:
            raise

    # joblib.load accepts a path, so no file handle is left open
    return joblib.load(f's3_models/local_{KEY}')

Now, when I try to use this function:

model = download_s3_model()

I get this ModuleNotFoundError:

ModuleNotFoundError: No module named 'heartdisease'

heartdisease is a module I created in the other virtual environment.
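
For context, here is a minimal sketch of how this kind of error can arise. pickle (which joblib builds on) stores a class by its module path and name rather than by its code, so loading requires that module to be importable. The module name fake_heartdisease and the Preprocessor class are hypothetical stand-ins, not part of the original project, and whether the model above references heartdisease in exactly this way is an assumption.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical stand-in for the `heartdisease` module: a tiny module written
# to a temporary directory so it can be made importable and then removed.
MODULE_NAME = "fake_heartdisease"
MODULE_SOURCE = (
    "class Preprocessor:\n"
    "    def __init__(self):\n"
    "        self.scale = 2\n"
)


def pickle_then_load_without_module():
    """Pickle an object whose class lives in MODULE_NAME, then try to load it
    once the module is no longer importable. Returns the raised exception,
    or None if loading unexpectedly succeeds."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        with open(os.path.join(tmp_dir, MODULE_NAME + ".py"), "w") as fh:
            fh.write(MODULE_SOURCE)

        # "Training" environment: the module is importable.
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()
        module = importlib.import_module(MODULE_NAME)
        payload = pickle.dumps(module.Preprocessor())

        # "Serving" environment: the module is gone.
        sys.path.remove(tmp_dir)
        del sys.modules[MODULE_NAME]
        importlib.invalidate_caches()

    try:
        pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(repr(pickle_then_load_without_module()))
```

The payload only carries the name fake_heartdisease.Preprocessor plus the instance state, which is why the second environment needs the same module installed or on its import path.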

This is the function that writes the model to S3:

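# Imports this snippet relies on. `fit` and `RF` are defined elsewhere in the
# project and are not shown here.
import tempfile

import boto3
import joblib

import heartdisease


def write_to_S3(data_bucket, data_key, model_version, bucket_name):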
    """
    Train the model on the entire dataset and save it in memory to
    subsequently write it to an S3 bucket on AWS.
    """
    df = heartdisease.get_S3_df(data_bucket, data_key)
    X = df.drop(columns='target')
    y = df['target']
    fitted_model = fit(RF, X, y)

    key = f'model_{model_version}.joblib'

    with tempfile.TemporaryFile() as file:
        joblib.dump(fitted_model, file)
        file.seek(0)

        s3_resource = boto3.resource('s3')
        s3_resource.Object(bucket_name, key).put(Body=file.read())
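
One way to check, in the training environment, which modules a fitted object would need at load time is to walk its pickle stream and record every module it references. The sketch below does this with a pickle.Unpickler subclass that overrides find_class. The names ModuleRecorder, _Placeholder and referenced_modules are hypothetical and introduced only for illustration. It assumes the object is serialized with pickle.dumps rather than read back from a joblib file (joblib may store array data in its own layout), and the placeholder may not cope with every possible pickle.

```python
import io
import pickle
from fractions import Fraction


class _Placeholder:
    """Stand-in returned for every class or function the pickle references,
    so the stream can be walked without importing the real modules."""

    def __new__(cls, *args, **kwargs):
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        pass

    # The methods below absorb the state that pickle tries to restore.
    def __setstate__(self, state):
        pass

    def __setitem__(self, key, value):
        pass

    def append(self, item):
        pass

    def extend(self, items):
        pass

    def add(self, item):
        pass


class ModuleRecorder(pickle.Unpickler):
    """Unpickler that records module names instead of importing them."""

    def __init__(self, file):
        super().__init__(file)
        self.modules = set()

    def find_class(self, module, name):
        # Called for every global (class or function) named in the stream.
        self.modules.add(module)
        return _Placeholder


def referenced_modules(payload):
    """Return the set of module names referenced by a pickle byte string."""
    recorder = ModuleRecorder(io.BytesIO(payload))
    recorder.load()
    return recorder.modules


if __name__ == "__main__":
    # Stand-in object; in the function above this would be fitted_model.
    print(sorted(referenced_modules(pickle.dumps(Fraction(1, 2)))))
```

If a project-local module such as heartdisease shows up in the result for pickle.dumps(fitted_model), the loading environment would need that module available as well.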

How can I fix this?

0 answers:

No answers