I am trying to apply BERT to my data, but the data is very large and I am running into a memory error. Therefore, I want to run the following step on input_ids in batches:
with torch.no_grad():
    last_hidden_states = model(input_ids)
The full code is as follows:
import numpy as np
import torch

tokenized = df['text'].apply(lambda x: tokenizer.encode(x, add_special_tokens=True))

# find the length of each row, then take the maximum
count = list()
for loop1 in range(len(tokenized)):
    count.append(len(tokenized[loop1]))
max_count = max(count)

# pad every sequence to the same length
pad_len = max_count + 3
padded = [np.pad(text, (0, pad_len - len(text)), 'constant') for text in tokenized if len(text) < pad_len]

# input tensor
input_ids = torch.tensor(np.array(padded)).to(torch.int64)

# running the data through the BERT model (inference only, no gradients)
with torch.no_grad():
    last_hidden_states = model(input_ids)

# Slice the output for the first position for all the sequences, take all hidden unit outputs
features = last_hidden_states[0][:, 0, :].numpy()
I tried the code below, but the resulting tensor has a different shape from the original one (the one produced without batching):
last_hidden_states = list()
loop4 = 0
gap = 4
for loop3 in range(0, 20, gap):
    with torch.no_grad():
        last_hidden_states.extend(model(input_ids[loop4:loop3+gap]))
    loop4 += gap
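I suspect the problem is that model(...) returns a tuple, so extend() adds the elements of that tuple to the list instead of the hidden states themselves. What I have in mind is something like the rough sketch below (it reuses model and input_ids from the code above and assumes, as in the unbatched version, that the hidden states are the first element of the model output): keep only the hidden-state tensor per batch and concatenate at the end. I am not sure this is correct, though:

gap = 4
batch_outputs = []
with torch.no_grad():
    for start in range(0, input_ids.shape[0], gap):
        # keep only the hidden-state tensor (output[0]) for this mini-batch
        out = model(input_ids[start:start + gap])
        batch_outputs.append(out[0])

# concatenate along the batch dimension so the result has the same shape
# as the unbatched run: (num_rows, pad_len, hidden_size)
last_hidden_states = torch.cat(batch_outputs, dim=0)
features = last_hidden_states[:, 0, :].numpy()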
Any ideas?