PyTorch: dynamic number of layers?

Date: 2020-07-16 14:49:38

Tags: pytorch pytorch-lightning

I am trying to specify a dynamic number of layers, and I seem to be doing it wrong. My problem is that when I define the 100 layers as shown here, the forward pass throws an error; but when I define a single layer directly (like TO_ILLUSTRATE below), it works. Simplified example below:

import torch
from torch import nn
from pytorch_lightning import LightningModule

class PredictFromEmbeddParaSmall(LightningModule):
    def __init__(self, hyperparams={'lr': 0.0001}):
        super(PredictFromEmbeddParaSmall, self).__init__()
        # Input is something like tensor.size=[768*100]
        self.para_count = 100          # number of 768-wide slices
        self.TO_ILLUSTRATE = nn.Linear(768, 5)
        self.enc_red = []
        for i in range(100):
            self.enc_red.append(nn.Linear(768, 5))
        # gather the layers' outputs
        self.dense_simple1 = nn.Linear(5 * 100, 2)
        self.output = nn.Sigmoid()

    def forward(self, x):
        # first input to enc_red
        x_vecs = []
        for i in range(self.para_count):
            layer = self.enc_red[i]
            # The first dim is the batch size here, output is correct
            processed_slice = x[:, i * 768:(i + 1) * 768]
            # This works and gives an output of size 5
            rand = self.TO_ILLUSTRATE(processed_slice)
            # This will fail? Error below
            ret = layer(processed_slice)
            # more things happen here that we can ignore for now, since we fail earlier

I get this error when executing "ret = layer(processed_slice)":

RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_addmm

Is there a smarter way to program this, or a way to fix the error?

1 Answer:

Answer 0 (score: 2):

You should use a ModuleList from PyTorch instead of a plain Python list: https://pytorch.org/docs/master/generated/torch.nn.ModuleList.html. This is because PyTorch has to keep track of every module that belongs to the model; if you only append them to a plain list, they are never registered as submodules, so calls like .cuda() or .to(device) do not move their parameters, which is exactly the error you are seeing.
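For illustration, a minimal sketch of the difference in registration (not from the question; the three layers and their sizes are arbitrary):

from torch import nn

class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # plain Python list: the linears are NOT registered as submodules
        self.layers = [nn.Linear(768, 5) for _ in range(3)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # ModuleList: the linears are registered and follow .cuda()/.to(device)
        self.layers = nn.ModuleList([nn.Linear(768, 5) for _ in range(3)])

print(len(list(WithPlainList().parameters())))   # 0 -> these layers stay on the CPU after .cuda()
print(len(list(WithModuleList().parameters())))  # 6 -> moved together with the model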

Your code should then look something like this:

class PredictFromEmbeddParaSmall(LightningModule):
    def __init__(self, hyperparams={'lr': 0.0001}):
        super(PredictFromEmbeddParaSmall, self).__init__()
        # Input is something like tensor.size=[768*100]
        self.para_count = 100          # number of 768-wide slices
        self.TO_ILLUSTRATE = nn.Linear(768, 5)
        self.enc_red = nn.ModuleList()               # << MODIFIED LINE <<
        for i in range(100):
            self.enc_red.append(nn.Linear(768, 5))
        # gather the layers' outputs
        self.dense_simple1 = nn.Linear(5 * 100, 2)
        self.output = nn.Sigmoid()

    def forward(self, x):
        # first input to enc_red
        x_vecs = []
        for i in range(self.para_count):
            layer = self.enc_red[i]
            # The first dim is the batch size here, output is correct
            processed_slice = x[:, i * 768:(i + 1) * 768]
            # This works and gives an output of size 5
            rand = self.TO_ILLUSTRATE(processed_slice)
            # This previously failed; with ModuleList it works
            ret = layer(processed_slice)
            # more things happen here that we can ignore for now

Then it should work fine!
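As a quick sanity check, a hypothetical snippet (assuming a CUDA device is available; the batch size of 4 is arbitrary):

model = PredictFromEmbeddParaSmall().cuda()
x = torch.rand(4, 768 * 100, device="cuda")
model(x)  # the loop over self.enc_red now runs without the device-mismatch error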

Edit: an alternative approach.

Instead of using a ModuleList you can also just use nn.Sequential, which lets you avoid the for loop in the forward pass. It also means you will not have access to the intermediate activations, so it is not the solution if you need those.

class PredictFromEmbeddParaSmall(LightningModule):
    def __init__(self, hyperparams={'lr': 0.0001}):
        super(PredictFromEmbeddParaSmall, self).__init__()
        # Input is something like tensor.size=[768*100]
        self.TO_ILLUSTRATE = nn.Linear(768, 5)
        layers = []
        for i in range(100):
            layers.append(nn.Linear(768, 5))

        self.enc_red = nn.Sequential(*layers)        # << MODIFIED LINE <<
        # gather the layers' outputs
        self.dense_simple1 = nn.Linear(5 * 100, 2)
        self.output = nn.Sigmoid()

    def forward(self, x):
        # first input to enc_red
        x_vecs = []
        out = self.enc_red(x)                        # << MODIFIED LINE <<