Question

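I'm here because I've run into a really obscure problem. As soon as I start creating my model, my GPU usage spikes and then my Python code crashes. This happens only when I load a model with 'from_pretrained'; TensorFlow and PyTorch on their own work fine (the behaviour is specific to transformers).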

For example, the problem occurs when running this line of code, right at the start of my script:

model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased')

I get the following messages, which are all fairly standard, but as you can see at the bottom, the process simply stops.

2021-04-16 16:16:35.330093: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-16 16:16:38.495667: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-04-16 16:16:38.519178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2021-04-16 16:16:38.519500: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-16 16:16:38.528695: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-04-16 16:16:38.528923: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-04-16 16:16:38.533582: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-04-16 16:16:38.535368: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-04-16 16:16:38.540093: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-04-16 16:16:38.543728: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-04-16 16:16:38.544662: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-04-16 16:16:38.544888: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0
2021-04-16 16:16:38.545436: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-16 16:16:38.546588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2021-04-16 16:16:38.547283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0
2021-04-16 16:16:39.115250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1300] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-16 16:16:39.115490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0
2021-04-16 16:16:39.115592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1319] 0: N
2021-04-16 16:16:39.115856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1446] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4634 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2021-04-16 16:16:39.419407: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-04-16 16:16:39.709427: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll

Process finished with exit code -1073741819 (0xC0000005)
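
For reference, the exit code in the line above is the signed 32-bit form of a Windows NTSTATUS value; 0xC0000005 is STATUS_ACCESS_VIOLATION, which means the process died in native code rather than raising a Python exception. A small sketch of the conversion (the helper name exit_code_to_ntstatus is hypothetical, not part of any library):

```python
def exit_code_to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as an unsigned hex status."""
    # Masking with 0xFFFFFFFF maps the negative value onto its unsigned form.
    return "0x{:08X}".format(exit_code & 0xFFFFFFFF)


print(exit_code_to_ntstatus(-1073741819))  # prints 0xC0000005
```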

Has anyone else seen this? Is there something I'm missing here? Thanks for your help.

Here are the details of my system.

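transformers version: latest
Platform: Windows
Python version: 3.7
PyTorch version (GPU?): latest
TensorFlow version (GPU?): latest
Using GPU in script?: Yes, GeForce GTX 1060, compute capability 6.1
Using distributed or parallel set-up in script?: No

Models I'm getting this error with:

Models relevant to this issue: ALBERT, BERT, XLM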

-- UPDATE: I've narrowed the problem down further to all TF[ModelName] pretrained models.
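
One way to narrow this down further, sketched here under assumptions rather than taken from the original post: run each load in a fresh interpreter with Python's faulthandler enabled, so that a native crash shows up as an exit code plus a low-level traceback instead of killing the main script. The helper names run_isolated and diagnose, the PyTorch snippet used for comparison, and the extra run with CUDA_VISIBLE_DEVICES=-1 (which hides the GPU to check whether the crash is GPU-specific) are all illustrative choices.

```python
import os
import subprocess
import sys

# Snippets to try in isolation. The PyTorch entry is an assumed counterpart
# added for comparison; only the TF line appears in the question.
CANDIDATES = {
    "tf_bert": (
        "from transformers import TFBertForSequenceClassification\n"
        "TFBertForSequenceClassification.from_pretrained('bert-base-uncased')\n"
    ),
    "pt_bert": (
        "from transformers import BertForSequenceClassification\n"
        "BertForSequenceClassification.from_pretrained('bert-base-uncased')\n"
    ),
}


def run_isolated(snippet, env_overrides=None, timeout=600):
    """Run `snippet` in a new Python process; return (exit_code, stderr)."""
    env = dict(os.environ)
    env.update(env_overrides or {})
    proc = subprocess.run(
        # -X faulthandler dumps a traceback to stderr on fatal native errors.
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        env=env,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def diagnose():
    """Try every candidate with the GPU visible and with it hidden."""
    runs = (("gpu", {}), ("cpu_only", {"CUDA_VISIBLE_DEVICES": "-1"}))
    for name, snippet in CANDIDATES.items():
        for label, env in runs:
            code, stderr = run_isolated(snippet, env)
            print(name, label, "exit code:", code)
            if code != 0:
                # The faulthandler traceback, if any, is at the end of stderr.
                print(stderr[-2000:])
```

Calling diagnose() prints one exit code per combination. If the TF load fails only when the GPU is visible, that would point toward the CUDA/cuDNN setup rather than transformers itself, though this is a heuristic and not a definitive diagnosis.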

Python crashes when loading a Bert model with from_pretrained

0 answers: