I first tried TensorFlow 2.3, but it did not work, so I downgraded to TensorFlow 2.2, and it still shows the same error. What am I doing wrong? Below are my TensorFlow, CUDA, and cuDNN versions.
bash-4.2 $ pip3 list
...
tensorboard 2.2.2
tensorboard-plugin-wit 1.7.0
tensorflow 2.2.0
tensorflow-estimator 2.2.0
...
bash-4.2 $ cd /usr/local
bash-4.2 $ ls
bin build cuda-10.0 cuda-10.2 cuda-7.0 cuda-8.0 cuda-9.1 etc include lib64 LICENSE python3 sbin src
boost cuda cuda-10.1 cuda-11.0 cuda-7.5 cuda-9.0 cuda-9.2 games lib libexec NOTICE Readme.md share
bash-4.2 $ cd cuda-10.1/lib64
bash-4.2 $ ls
...
libcudnn.so libcudnn.so.7 libcudnn.so.7.5.0 libcudnn.so.7.6.4 libcudnn.so.7.6.5
...
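The cuDNN versions present in that directory can be read off the file names. A minimal sketch, assuming the `ls` output is pasted in as a string; the helper name `cudnn_versions` is hypothetical and only looks at fully versioned names (major.minor.patch):

```python
import re

# File names copied from the `ls` output above.
listing = "libcudnn.so libcudnn.so.7 libcudnn.so.7.5.0 libcudnn.so.7.6.4 libcudnn.so.7.6.5"


def cudnn_versions(names):
    """Return (major, minor, patch) tuples found in cuDNN file names, sorted ascending."""
    pattern = re.compile(r"^libcudnn\.so\.(\d+)\.(\d+)\.(\d+)$")
    found = []
    for name in names.split():
        match = pattern.match(name)
        if match:
            found.append(tuple(int(part) for part in match.groups()))
    return sorted(found)


versions = cudnn_versions(listing)
print(versions)
```

Which of these files `libcudnn.so.7` actually resolves to depends on the symlink, which the listing above does not show.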
I need to run the Python file from_keras.py.
File from_keras.py:
import tvm
from tvm import te
import tvm.relay as relay
from tvm.contrib.download import download_testdata
import keras
import numpy as np
######################################################################
# Load pretrained keras model
# ----------------------------
# We load a pretrained resnet-50 classification model provided by keras.
weights_url = "".join(
    [
        "https://github.com/fchollet/deep-learning-models/releases/",
        "download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5",
    ]
)
weights_file = "resnet50_weights.h5"
weights_path = download_testdata(weights_url, weights_file, module="keras")
keras_resnet50 = keras.applications.resnet50.ResNet50(
    include_top=True, weights=None, input_shape=(224, 224, 3), classes=1000
)
keras_resnet50.load_weights(weights_path)
######################################################################
# Load a test image
# ------------------
# A single cat dominates the examples!
from PIL import Image
from matplotlib import pyplot as plt
from keras.applications.resnet50 import preprocess_input
img_url = "https://github.com/dmlc/mxnet.js/blob/main/data/cat.png?raw=true"
img_path = download_testdata(img_url, "cat.png", module="data")
img = Image.open(img_path).resize((224, 224))
plt.imshow(img)
plt.show()
# input preprocess
data = np.array(img)[np.newaxis, :].astype("float32")
data = preprocess_input(data).transpose([0, 3, 1, 2])
print("input_1", data.shape)
######################################################################
# Compile the model with Relay
# ----------------------------
# convert the keras model(NHWC layout) to Relay format(NCHW layout).
shape_dict = {"input_1": data.shape}
mod, params = relay.frontend.from_keras(keras_resnet50, shape_dict)
# compile the model
target = "cuda"
ctx = tvm.gpu(0)
with tvm.transform.PassContext(opt_level=3):
    executor = relay.build_module.create_executor("graph", mod, ctx, target)
######################################################################
# Execute on TVM
# ---------------
dtype = "float32"
tvm_out = executor.evaluate()(tvm.nd.array(data.astype(dtype)), **params)
top1_tvm = np.argmax(tvm_out.asnumpy()[0])
#####################################################################
# Look up synset name
# -------------------
# Look up prediction top 1 index in 1000 class synset.
synset_url = "".join(
    [
        "https://gist.githubusercontent.com/zhreshold/",
        "4d0b62f3d01426887599d4f7ede23ee5/raw/",
        "596b27d23537e5a1b5751d2b0481ef172f58b539/",
        "imagenet1000_clsid_to_human.txt",
    ]
)
synset_name = "imagenet1000_clsid_to_human.txt"
synset_path = download_testdata(synset_url, synset_name, module="data")
with open(synset_path) as f:
    synset = eval(f.read())
print("Relay top-1 id: {}, class name: {}".format(top1_tvm, synset[top1_tvm]))
# confirm correctness with keras output
keras_out = keras_resnet50.predict(data.transpose([0, 2, 3, 1]))
top1_keras = np.argmax(keras_out)
print("Keras top-1 id: {}, class name: {}".format(top1_keras, synset[top1_keras]))
bash-4.2 $ python3 from_keras.py
File /uac/y16/jhchoi6/.tvm_test_data/keras/resnet50_weights.h5 exists, skip.
2020-10-16 13:37:06.154190: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-10-16 13:37:06.263238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:18:00.0 name: TITAN Xp computeCapability: 6.1
coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s
2020-10-16 13:37:06.264335: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-16 13:37:06.270718: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-16 13:37:06.276767: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-16 13:37:06.277829: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-16 13:37:06.284514: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-16 13:37:06.287641: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-16 13:37:06.298349: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-16 13:37:06.301762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-10-16 13:37:06.302407: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-10-16 13:37:06.325261: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2200000000 Hz
2020-10-16 13:37:06.325619: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5d73580 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-16 13:37:06.325678: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-10-16 13:37:06.498867: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5de6040 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-10-16 13:37:06.498934: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2020-10-16 13:37:06.500142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:18:00.0 name: TITAN Xp computeCapability: 6.1
coreClock: 1.582GHz coreCount: 30 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 510.07GiB/s
2020-10-16 13:37:06.500270: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-10-16 13:37:06.500314: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-10-16 13:37:06.500356: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-10-16 13:37:06.500397: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-10-16 13:37:06.500437: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-10-16 13:37:06.500478: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-10-16 13:37:06.500519: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-10-16 13:37:06.502436: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-10-16 13:37:06.503788: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-16 13:37:06.503830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-10-16 13:37:06.503869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-10-16 13:37:06.505899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11324 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:18:00.0, compute capability: 6.1)
2020-10-16 13:37:07.195397: F ./tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: invalid configuration argument
Aborted (core dumped)
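Every timestamped line in the output above is informational (`I`) except the last one, which is fatal (`F`). A minimal sketch for picking such lines out of a saved log, assuming the one-letter severity follows the timestamp as it does here; the helper name `lines_with_severity` is hypothetical, and the second sample line is shortened with `...`:

```python
import re

# Two lines copied from the output above; the second is shortened with "...".
log_lines = [
    "2020-10-16 13:37:06.301762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0",
    "2020-10-16 13:37:07.195397: F ./tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: ... status: Internal: invalid configuration argument",
]

# Timestamp, a colon, then a one-letter severity (I, W, E or F).
SEVERITY = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) ")


def lines_with_severity(lines, wanted):
    """Return the log lines whose severity letter is one of `wanted`."""
    selected = []
    for line in lines:
        match = SEVERITY.match(line)
        if match and match.group(1) in wanted:
            selected.append(line)
    return selected


fatal = lines_with_severity(log_lines, "F")
for line in fatal:
    print(line)
```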
bash-4.2 $ nvidia-smi
Fri Oct 16 13:54:43 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:18:00.0 Off |                  N/A |
| 23%   30C    P8     8W / 250W |      2MiB / 12196MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+