I have the following code that tries to vectorize text with a Transformer-XL model:
text = "Some string about 5000 characters long"
tokenizer = TransfoXLTokenizerFast.from_pretrained('transfo-xl-wt103', cache_dir=my_local_dir, local_files_only=True)
model = TransfoXLModel.from_pretrained("transfo-xl-wt103", cache_dir=my_local_dir, local_files_only=True)
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
This produces:
output = model(**encoded_input)
File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 863, in forward
output_attentions=output_attentions,
File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 385, in forward
dec_inp, r, attn_mask=dec_attn_mask, mems=mems, head_mask=head_mask, output_attentions=output_attentions,
File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/w//default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 338, in forward
attn_score = attn_score.float().masked_fill(attn_mask[:, :, :, None], -1e30).type_as(attn_score)
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 2007869696 bytes. Error code 12 (Cannot allocate memory)
I'm somewhat confused by this, because the request is for 2007869696 bytes, which is only about 2 GB, and the machine has 64 GB of RAM. So I don't understand why it is even making this request, let alone why it fails.
Where can I change the setting that controls this and lets the process get more RAM? This is just a small call in the example code, and I see very few places that even accept such a parameter.
Answer 0 (score: 2)
Are you sure you want to use the GPU rather than the CPU?
Try running the script with CUDA_LAUNCH_BLOCKING=1 python script.py. This will produce the correct Python stack trace (since CUDA calls are asynchronous).
You can also select the GPU with CUDA_VISIBLE_DEVICES, by setting export CUDA_VISIBLE_DEVICES=device_number.
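If the intent really is to run on the GPU, a minimal sketch of the device move (my addition, not part of the original answer; it reuses the variable names from the question) looks like this:

import torch

# Sketch (assumption): move the model and the encoded inputs to the GPU,
# so the large attention tensor is allocated in GPU memory rather than
# through the CPU allocator.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
encoded_input = {k: v.to(device) for k, v in encoded_input.items()}
with torch.no_grad():  # inference only; skip gradient bookkeeping
    output = model(**encoded_input)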
There is also a still-open issue on the pytorch GitHub; try checking it out.
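Separately from the CPU/GPU question, the allocation that fails here is the attention-score tensor, whose size grows roughly with the square of the input length. A common workaround is to feed the text in chunks and carry Transformer-XL's memory ("mems") from one call to the next. This is my sketch, not part of the original answer; it assumes a transformers version whose model outputs expose .mems and .last_hidden_state (older versions return plain tuples), and chunk_size is an assumed value:

import torch

# Sketch (assumption): process the long input in fixed-size chunks and
# pass Transformer-XL's 'mems' between calls, so no single attention
# tensor has to span the whole ~5000-character input at once.
encoded_input = tokenizer(text, return_tensors='pt')
input_ids = encoded_input['input_ids']

chunk_size = 128  # assumed value; tune it to your memory budget
mems = None
with torch.no_grad():
    for start in range(0, input_ids.size(1), chunk_size):
        chunk = input_ids[:, start:start + chunk_size]
        output = model(input_ids=chunk, mems=mems)
        mems = output.mems  # reuse the memory segment in the next step
last_hidden_state = output.last_hidden_state  # hidden states for the final chunk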