AWS SageMaker - ModuleNotFoundError: No module named 'cv2'

Time: 2021-04-14 13:46:26

Tags: opencv image-processing amazon-ec2 object-detection amazon-sagemaker

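I am trying to run object detection code on AWS. Although opencv is listed in the requirements file, the training job fails with "No module named 'cv2'". I don't know how to resolve this error. Can anyone help?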

My requirement.txt file contains:

  • opencv-python
  • numpy>=1.18.2
  • scipy>=1.4.1
  • wget>=3.2
  • tensorflow==2.3.1
  • tensorflow-gpu==2.3.1
  • tqdm==4.43.0
  • pandas
  • boto3
  • awscli
  • urllib3
  • ms

I also tried installing "imgaug" and "opencv-python headless", but the error persists. Here is the full output:

sh-4.2$ python train_launch.py 
[INFO-ROLE] arn:aws:iam::021945294007:role/service-role/AmazonSageMaker-ExecutionRole-20200225T145269
train_instance_type has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
train_instance_count has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
train_instance_type has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
2021-04-14 13:29:58 Starting - Starting the training job...
2021-04-14 13:30:03 Starting - Launching requested ML instances......
2021-04-14 13:31:11 Starting - Preparing the instances for training......
2021-04-14 13:32:17 Downloading - Downloading input data...
2021-04-14 13:32:41 Training - Downloading the training image..WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/__init__.py:1473: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

2021-04-14 13:33:03,970 sagemaker-containers INFO     Imported framework sagemaker_tensorflow_container.training
2021-04-14 13:33:05,030 sagemaker-containers INFO     Invoking user script

Training Env:

{
    "additional_framework_parameters": {},
    "channel_input_dirs": {
        "training": "/opt/ml/input/data/training"
    },
    "current_host": "algo-1",
    "framework_module": "sagemaker_tensorflow_container.training:main",
    "hosts": [
        "algo-1"
    ],
    "hyperparameters": {
        "unfreezed_epochs": 2,
        "freezed_batch_size": 8,
        "freezed_epochs": 1,
        "unfreezed_batch_size": 8,
        "model_dir": "s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model"
    },
    "input_config_dir": "/opt/ml/input/config",
    "input_data_config": {
        "training": {
            "TrainingInputMode": "File",
            "S3DistributionType": "FullyReplicated",
            "RecordWrapperType": "None"
        }
    },
    "input_dir": "/opt/ml/input",
    "is_master": true,
    "job_name": "yolov4-2021-04-14-15-29",
    "log_level": 20,
    "master_hostname": "algo-1",
    "model_dir": "/opt/ml/model",
    "module_dir": "s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_smal/yolov4-2021-04-14-15-29/source/sourcedir.tar.gz",
    "module_name": "train_indu",
    "network_interface_name": "eth0",
    "num_cpus": 8,
    "num_gpus": 1,
    "output_data_dir": "/opt/ml/output/data",
    "output_dir": "/opt/ml/output",
    "output_intermediate_dir": "/opt/ml/output/intermediate",
    "resource_config": {
        "current_host": "algo-1",
        "hosts": [
            "algo-1"
        ],
        "network_interface_name": "eth0"
    },
    "user_entry_point": "train_indu.py"
}

Environment variables:

SM_HOSTS=["algo-1"]
SM_NETWORK_INTERFACE_NAME=eth0
SM_HPS={"freezed_batch_size":8,"freezed_epochs":1,"model_dir":"s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model","unfreezed_batch_size":8,"unfreezed_epochs":2}
SM_USER_ENTRY_POINT=train_indu.py
SM_FRAMEWORK_PARAMS={}
SM_RESOURCE_CONFIG={"current_host":"algo-1","hosts":["algo-1"],"network_interface_name":"eth0"}
SM_INPUT_DATA_CONFIG={"training":{"RecordWrapperType":"None","S3DistributionType":"FullyReplicated","TrainingInputMode":"File"}}
SM_OUTPUT_DATA_DIR=/opt/ml/output/data
SM_CHANNELS=["training"]
SM_CURRENT_HOST=algo-1
SM_MODULE_NAME=train_indu
SM_LOG_LEVEL=20
SM_FRAMEWORK_MODULE=sagemaker_tensorflow_container.training:main
SM_INPUT_DIR=/opt/ml/input
SM_INPUT_CONFIG_DIR=/opt/ml/input/config
SM_OUTPUT_DIR=/opt/ml/output
SM_NUM_CPUS=8
SM_NUM_GPUS=1
SM_MODEL_DIR=/opt/ml/model
SM_MODULE_DIR=s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_smal/yolov4-2021-04-14-15-29/source/sourcedir.tar.gz
SM_TRAINING_ENV={"additional_framework_parameters":{},"channel_input_dirs":{"training":"/opt/ml/input/data/training"},"current_host":"algo-1","framework_module":"sagemaker_tensorflow_container.training:main","hosts":["algo-1"],"hyperparameters":{"freezed_batch_size":8,"freezed_epochs":1,"model_dir":"s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model","unfreezed_batch_size":8,"unfreezed_epochs":2},"input_config_dir":"/opt/ml/input/config","input_data_config":{"training":{"RecordWrapperType":"None","S3DistributionType":"FullyReplicated","TrainingInputMode":"File"}},"input_dir":"/opt/ml/input","is_master":true,"job_name":"yolov4-2021-04-14-15-29","log_level":20,"master_hostname":"algo-1","model_dir":"/opt/ml/model","module_dir":"s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_smal/yolov4-2021-04-14-15-29/source/sourcedir.tar.gz","module_name":"train_indu","network_interface_name":"eth0","num_cpus":8,"num_gpus":1,"output_data_dir":"/opt/ml/output/data","output_dir":"/opt/ml/output","output_intermediate_dir":"/opt/ml/output/intermediate","resource_config":{"current_host":"algo-1","hosts":["algo-1"],"network_interface_name":"eth0"},"user_entry_point":"train_indu.py"}
SM_USER_ARGS=["--freezed_batch_size","8","--freezed_epochs","1","--model_dir","s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model","--unfreezed_batch_size","8","--unfreezed_epochs","2"]
SM_OUTPUT_INTERMEDIATE_DIR=/opt/ml/output/intermediate
SM_CHANNEL_TRAINING=/opt/ml/input/data/training
SM_HP_UNFREEZED_EPOCHS=2
SM_HP_FREEZED_BATCH_SIZE=8
SM_HP_FREEZED_EPOCHS=1
SM_HP_UNFREEZED_BATCH_SIZE=8
SM_HP_MODEL_DIR=s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model
PYTHONPATH=/opt/ml/code:/usr/local/bin:/usr/lib/python36.zip:/usr/lib/python3.6:/usr/lib/python3.6/lib-dynload:/usr/local/lib/python3.6/dist-packages:/usr/lib/python3/dist-packages

Invoking script with the following command:

/usr/bin/python3 train_indu.py --freezed_batch_size 8 --freezed_epochs 1 --model_dir s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model --unfreezed_batch_size 8 --unfreezed_epochs 2


WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/__init__.py:1473: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4667030854237447206
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 3059419181456814147
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 6024475084695919958
physical_device_desc: "device: XLA_GPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 14949928141
locality {
  bus_id: 1
  links {
  }
}
incarnation: 13034103301168381073
physical_device_desc: "device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5"
]
Traceback (most recent call last):
  File "train_indu.py", line 12, in <module>
    from yolov3.dataset import Dataset
  File "/opt/ml/code/yolov3/dataset.py", line 3, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'
2021-04-14 13:33:08,453 sagemaker-containers ERROR    ExecuteUserScriptError:
Command "/usr/bin/python3 train_indu.py --freezed_batch_size 8 --freezed_epochs 1 --model_dir s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model --unfreezed_batch_size 8 --unfreezed_epochs 2"

2021-04-14 13:33:11 Uploading - Uploading generated training model
2021-04-14 13:33:54 Failed - Training job failed
Traceback (most recent call last):
  File "train_launch.py", line 41, in <module>
    estimator.fit(s3_data_path, logs=True, job_name=job_name) #the argument logs is crucial if you want to see what happends
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/estimator.py", line 535, in fit
    self.latest_training_job.wait(logs=logs)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/estimator.py", line 1210, in wait
    self.sagemaker_session.logs_for_job(self.job_name, wait=True, log_type=logs)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/session.py", line 3365, in logs_for_job
    self._check_job_status(job_name, description, "TrainingJobStatus")
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/sagemaker/session.py", line 2957, in _check_job_status
    actual_status=status,
sagemaker.exceptions.UnexpectedStatusException: Error for Training job yolov4-2021-04-14-15-29: Failed. Reason: AlgorithmError: ExecuteUserScriptError:
Command "/usr/bin/python3 train_indu.py --freezed_batch_size 8 --freezed_epochs 1 --model_dir s3://sagemaker-dataset-ai/Dataset/yolo/Results/yolov4_small/yolov4-2021-04-14-15-29/model --unfreezed_batch_size 8 --unfreezed_epochs 2"

1 answer:

Answer 0 (score: 1)

Make sure your estimator is created with:

  • framework_version = '2.3',
  • py_version = 'py37',