PyTorch DataParallel does not work when I run it on multiple GPUs

Date: 2021-06-15 22:10:04

Tags: pytorch

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import Linear
from torch_geometric.nn import SAGEConv, GlobalAttention, GATConv
from torch_geometric.nn import SuperGATConv as sgat

class myGlobalAttentionGATNet3(torch.nn.Module):

    def __init__(self, num_node_features, hidden_channels):
        super(myGlobalAttentionGATNet3, self).__init__()
        # Two GAT layers that map the node features down to hidden_channels
        self.conv1 = GATConv(num_node_features, int(num_node_features / 2))
        self.conv2 = GATConv(int(num_node_features / 2), hidden_channels)
        # Attention-gated global pooling: one score per node, then a
        # weighted sum over the nodes of each graph in the batch
        self.pooling_gate_nn = Linear(hidden_channels, 1)
        self.pooling = GlobalAttention(self.pooling_gate_nn)
        self.lin = Linear(hidden_channels, num_node_features)

    def reset_parameters(self):
        self.conv1.reset_parameters()
        self.conv2.reset_parameters()
        self.pooling.reset_parameters()
        self.lin.reset_parameters()

    def forward(self, data):
        x, edge_index, batch = data.x, data.edge_index, data.batch
        x = self.conv1(x, edge_index)
        x = x.relu()
        x = self.conv2(x, edge_index)
        x = self.pooling(x, batch)  # pool node embeddings into one vector per graph
        x = F.relu(x)
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.lin(x)
        return x
model = myGlobalAttentionGATNet3(dataset.num_node_features, hidden_channels=hidden_channels)
model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])
model.to(device)

The code works fine on a single GPU. When I run it on multiple GPUs, it gives me the following error:

RuntimeError: Caught RuntimeError in replica 1 on device 1.

RuntimeError: Expected tensor for 'out' to have the same device as tensor for argument #3 'mat2'; but device 0 does not equal 1 (while checking arguments for addmm)
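
For context: torch.nn.DataParallel only knows how to scatter tensors (and lists/dicts/tuples of tensors) along the batch dimension. A torch_geometric Data/Batch object is none of those, so every replica receives a reference to the same object whose tensors still live on device 0; the replica on device 1 then feeds a device-0 input into its device-1 Linear weights, which is exactly the addmm device mismatch reported above. PyTorch Geometric ships its own DataParallel wrapper together with a DataListLoader for this situation. Below is a minimal sketch, assuming a PyG 1.x install and reusing dataset and hidden_channels from the question; the batch_size value is an arbitrary placeholder.

import torch
from torch_geometric.data import DataListLoader  # moved to torch_geometric.loader in PyG >= 2.0
from torch_geometric.nn import DataParallel      # PyG's wrapper, not torch.nn's

# DataListLoader yields plain Python lists of Data objects instead of one
# merged Batch, so PyG's DataParallel can split the list across the GPUs
# and re-batch each chunk on its target device.
loader = DataListLoader(dataset, batch_size=32, shuffle=True)  # batch_size is a placeholder

model = myGlobalAttentionGATNet3(dataset.num_node_features, hidden_channels=hidden_channels)
model = DataParallel(model, device_ids=[0, 1, 2, 3]).to(torch.device('cuda:0'))

for data_list in loader:
    out = model(data_list)  # each replica's forward(data) sees a normal Batch

Note that current PyTorch documentation generally recommends DistributedDataParallel over DataParallel, but the single-process sketch above is the smallest change to the posted code.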

0 Answers

No answers yet.