Graph classification with PyTorch

Posted: 2020-10-28 21:59:19

Tags: python-3.x machine-learning deep-learning pytorch

I am trying to perform graph classification. I have a list of DGL graphs that look like this:

DGLGraph(num_nodes=64267, num_edges=155523,
         ndata_schemes={}
         edata_schemes={'norm': Scheme(shape=(), dtype=torch.float32), 'rel_type': Scheme(shape=(17,), dtype=torch.float64)})

and their corresponding labels as tensor([0, 1]). Since I am not very familiar with PyTorch, I am following an example. The model is defined as follows:

import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F

def gcn_message(edges):
    # The argument is a batch of edges.
    # This computes a (batch of) message called 'msg' using the source node's feature 'h'.
    return {'msg' : edges.src['h']}

def gcn_reduce(nodes):
    # The argument is a batch of nodes.
    # This computes the new 'h' features by summing received 'msg' in each node's mailbox.
    return {'h' : torch.sum(nodes.mailbox['msg'], dim=1)}

# Define the GCNLayer module
class GCNLayer(nn.Module):
    def __init__(self, in_feats, out_feats):
        super(GCNLayer, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, g, inputs):
        # g is the graph and the inputs is the input node features
        # first set the node features
        g.ndata['h'] = inputs
        # trigger message passing on all edges
        g.send(g.edges(), gcn_message)
        # trigger aggregation at all nodes
        g.recv(g.nodes(), gcn_reduce)
        # get the result node features
        h = g.ndata.pop('h')
        # perform linear transformation
        return self.linear(h)
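A side note on the API used here: the send/recv pattern comes from an older DGL tutorial, and newer DGL releases express the same step with update_all and built-in message/reduce functions instead. Purely as a sketch, and assuming a recent DGL version (the class name below is illustrative, not from my code), the layer could equivalently be written as:

import dgl.function as fn
import torch.nn as nn

class GCNLayerUpdateAll(nn.Module):
    # Same computation as GCNLayer above, but using g.update_all instead of send/recv
    def __init__(self, in_feats, out_feats):
        super(GCNLayerUpdateAll, self).__init__()
        self.linear = nn.Linear(in_feats, out_feats)

    def forward(self, g, inputs):
        with g.local_scope():
            g.ndata['h'] = inputs
            # copy the source node's 'h' onto each edge as message 'm',
            # then sum the incoming messages into a new 'h' on each node
            g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h'))
            return self.linear(g.ndata['h'])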

class GCN(nn.Module):
    def __init__(self, in_feats, hidden_size, num_classes):
        super(GCN, self).__init__()
        self.gcn1 = GCNLayer(in_feats, hidden_size)
        self.gcn2 = GCNLayer(hidden_size, num_classes)

    def forward(self, g, inputs):
        h = self.gcn1(g, inputs)
        h = torch.relu(h)
        h = self.gcn2(g, h)
        return h
# The first layer transforms input features of size 7 to a hidden size of 16.
# The second layer transforms the hidden representation and produces output
# features of size 2, corresponding to the two classification groups
net = GCN(7, 16, 2)
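As a quick sanity check (the graph below is a toy with made-up sizes, using the older DGL API that still provides send/recv, which GCNLayer above requires), the model returns one row of logits per node rather than one per graph:

import dgl
import torch

# toy graph: 5 nodes, chain edges in both directions so every node gets a message
g = dgl.DGLGraph()
g.add_nodes(5)
g.add_edges([0, 1, 2, 3, 1, 2, 3, 4], [1, 2, 3, 4, 0, 1, 2, 3])
g.ndata['h'] = torch.randn(5, 7)   # 7 input features per node, matching GCN(7, 16, 2)

out = net(g, g.ndata['h'])
print(out.shape)   # torch.Size([5, 2]) -> one logit row per node, not one per graph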

for epoch in range(epochs):
    epoch_loss = 0
    epoch_logits = []
    labs = []
    # Iterate over batches
    for i, (bg, labels) in enumerate(train_loader):
        logits = net(bg, bg.ndata['h'])
        # we save the logits for visualization later
        train_logits.append(logits.detach().numpy())
        epoch_logits.append(logits.detach().numpy())
        labs.append(labels.unsqueeze(1).detach().numpy())
        logp = F.softmax(logits, 1)
        loss = loss_fn(logp, labels)

When loss_fn is called I get the error ValueError: Expected input batch_size (64267) to match target batch_size (64). For some reason every node in the graph is treated as a separate input sample, and the model returns a prediction for each of them (a toy reproduction of this mismatch is sketched at the end of the question). The collate function is defined as

def collate(samples):
    # The input `samples` is a list of pairs
    #  (graph, label).
    graphs, labels = map(list, zip(*samples))
    batched_graph = dgl.batch(graphs, node_attrs='h')
    batched_graph.set_n_initializer(dgl.init.zero_initializer)
    batched_graph.set_e_initializer(dgl.init.zero_initializer)
    return batched_graph, torch.stack(labels)

I use this collate function with another model built on dgllife.model.model_zoo.GCNPredictor, and there it does not cause any problems.
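For completeness, here is a self-contained toy reproduction of the shape mismatch, reusing net and collate from above (all sizes are made up, same old-DGL API as in the sanity check):

import dgl
import torch

# build 64 toy graphs of 10 nodes each, with 7-dimensional node features
graphs = []
for _ in range(64):
    g = dgl.DGLGraph()
    g.add_nodes(10)
    src = list(range(9)) + list(range(1, 10))
    dst = list(range(1, 10)) + list(range(9))
    g.add_edges(src, dst)   # chain edges in both directions
    g.ndata['h'] = torch.randn(10, 7)
    graphs.append(g)

samples = [(g, torch.tensor(i % 2)) for i, g in enumerate(graphs)]
bg, labels = collate(samples)
logits = net(bg, bg.ndata['h'])
print(logits.shape, labels.shape)   # torch.Size([640, 2]) vs torch.Size([64])
# The loss then compares an input batch of 640 (total nodes) with a target batch
# of 64 (graphs) -- the same mismatch as in the ValueError. Presumably a
# graph-level readout such as dgl.mean_nodes is needed to get one row per graph.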

0 Answers:

There are no answers yet.