Torch: why is this collate function so much faster than the other?

Date: 2021-01-26 15:53:13

Tags: python pytorch

I developed two collate functions to read data out of h5py files (I tried to create some synthetic data for an MWE here, but it didn't go to plan).

The difference between the two in handling my data is roughly 10x, which is a very large gap. I'm unsure why, and I'm curious for any insight I can apply to future collate functions.

import torch

def slow(batch):
    '''
    This function retrieves the data emitted from the H5 torch data set.
    It alters the emitted dimensions from the dataloader
    from: [batch_sz, layers, tokens, features], to:
    [layers, batch_sz, tokens, features]
    '''
    embeddings = []
    start_ids = []
    end_ids = []
    idxs = []
    for i in range(len(batch)):
        embeddings.append(batch[i]['embeddings'])
        start_ids.append(batch[i]['start_ids'])
        end_ids.append(batch[i]['end_ids'])
        idxs.append(batch[i]['idx'])
    # package data; swap to expected [layers, batch_sz, tokens, features]
    sample = {'embeddings': torch.as_tensor(embeddings).permute(1, 0, 2, 3),
              'start_ids': torch.as_tensor(start_ids),
              'end_ids': torch.as_tensor(end_ids),
              'idx': torch.as_tensor(idxs)}
    return sample
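For context, both functions are meant to be passed to a DataLoader via its collate_fn argument. A minimal sketch of the wiring (the h5_dataset variable and batch size here are placeholders, not from my actual setup):

from torch.utils.data import DataLoader

# h5_dataset is a placeholder for the H5 torch dataset that emits dicts
# with 'embeddings', 'start_ids', 'end_ids', and 'idx' keys.
loader = DataLoader(h5_dataset, batch_size=32, collate_fn=slow)

for sample in loader:
    # sample['embeddings'] arrives as [layers, batch_sz, tokens, features]
    ...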

I thought the one below, with more loops, would be slower, but that is not the case.

def fast(batch):
    ''' This function alters the emitted dimensions from the dataloader
    from: [batch_sz, layers, tokens, features]
    to: [layers, batch_sz, tokens, features] for the embeddings
    '''
    # turn data to tensors
    embeddings = torch.stack([torch.as_tensor(item['embeddings']) for item in batch])
    # swap to expected [layers, batch_sz, tokens, features]
    embeddings = embeddings.permute(1, 0, 2, 3)
    # get start ids
    start_ids = torch.stack([torch.as_tensor(item['start_ids']) for item in batch])
    # get end ids
    end_ids = torch.stack([torch.as_tensor(item['end_ids']) for item in batch])
    # get idxs
    idxs = torch.stack([torch.as_tensor(item['idx']) for item in batch])
    # repackage
    sample = {'embeddings': embeddings,
              'start_ids': start_ids,
              'end_ids': end_ids,
              'idx': idxs}  # keep 'idx' so the output matches slow()'s keys
    return sample
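Since my attempt at a synthetic MWE didn't go to plan, here is a minimal benchmark sketch that can stand in for it; the shapes, dtypes, and batch size below are made up, not my real data:

import timeit

import numpy as np
import torch

# Fake batch: a list of dicts shaped like what the H5 dataset emits.
# [layers, tokens, features] = [13, 128, 768] is a placeholder choice.
batch = [{'embeddings': np.random.rand(13, 128, 768).astype(np.float32),
          'start_ids': np.int64(0),
          'end_ids': np.int64(127),
          'idx': np.int64(i)}
         for i in range(32)]

print('slow:', timeit.timeit(lambda: slow(batch), number=10))
print('fast:', timeit.timeit(lambda: fast(batch), number=10))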

Edit: I tried switching to the version below; it is still roughly 10x slower than "fast".

def slow(batch):
    '''
    This function retrieves the data emitted from the H5 torch data set.
    It alters the emitted dimensions from the dataloader
    from: [batch_sz, layers, tokens, features], to:
    [layers, batch_sz, tokens, features]
    '''
    embeddings = []
    start_ids = []
    end_ids = []
    idxs = []
    for item in batch:
        embeddings.append(item['embeddings'])
        start_ids.append(item['start_ids'])
        end_ids.append(item['end_ids'])
        idxs.append(item['idx'])
    # package data; swap to expected [layers, batch_sz, tokens, features]
    sample = {'embeddings': torch.as_tensor(embeddings).permute(1, 0, 2, 3),
              'start_ids': torch.as_tensor(start_ids),
              'end_ids': torch.as_tensor(end_ids),
              'idx': torch.as_tensor(idxs)}
    return sample


1 Answer:

Answer 0 (score: 1)

See this answer (and give it an upvote): https://stackoverflow.com/a/30245465/10475762

In particular this line: "In other words and in general, list comprehensions perform faster because suspending and resuming a function's frame, or multiple functions in other cases, is slower than creating a list on demand."

So in your case, you're calling append multiple times in every collate call, and collate runs many times across your training/testing/evaluation steps, so it all adds up. IMO, avoid for loops whenever possible, as it seems to invariably lead to slowdowns.
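To illustrate just the quoted point in isolation, here is a toy micro-benchmark sketch (pure Python, unrelated to the tensor work above; exact numbers will vary by machine):

import timeit

def with_append(n=10_000):
    out = []
    for i in range(n):
        out.append(i * 2)
    return out

def with_comprehension(n=10_000):
    return [i * 2 for i in range(n)]

print('append loop:  ', timeit.timeit(with_append, number=1_000))
print('comprehension:', timeit.timeit(with_comprehension, number=1_000))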