BrokenProcessPool during language model pretraining

Date: 2019-11-18 20:33:03

Tags: nlp classification text-classification language-model fast-ai

I have been training a language model on Wikipedia in order to build a text classifier with fastai, working in Google Colab. After running for a few minutes, the process stops with the following error:

from fastai.text import *                    # TextList and the language-model data block API
from nlputils import get_wiki, split_wiki    # helper functions from the fastai course's nlputils.py

# download the Wikipedia dump and split it into one document per article
get_wiki(path, lang)
dest = split_wiki(path, lang)

bs = 64
data = (TextList.from_folder(dest)
                .split_by_rand_pct(0.1, seed=42)
                .label_for_lm()
                .databunch(bs=bs, num_workers=1))
data.save('tmp_lm')


BrokenProcessPool                         Traceback (most recent call last)
<ipython-input-4-60a9d06cb522> in <module>()
      1 bs=64
      2 data = (TextList.from_folder(dest)
----> 3                 .split_by_rand_pct(0.1, seed=42)
      4                 .label_for_lm()
      5                 .databunch(bs=bs, num_workers=1))

9 frames
/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in _inner(*args, **kwargs)
    478             self.valid = fv(*args, from_item_lists=True, **kwargs)
    479             self.__class__ = LabelLists
--> 480             self.process()
    481             return self
    482         return _inner

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self)
    532         "Process the inner datasets."
    533         xp,yp = self.get_processors()
--> 534         for ds,n in zip(self.lists, ['train','valid','test']): ds.process(xp, yp, name=n)
    535         #progress_bar clear the outputs so in some case warnings issued during processing disappear.
    536         for ds in self.lists:

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, xp, yp, name, max_warn_items)
    712                     p.warns = []
    713                 self.x,self.y = self.x[~filt],self.y[~filt]
--> 714         self.x.process(xp)
    715         return self
    716 

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, processor)
     82         if processor is not None: self.processor = processor
     83         self.processor = listify(self.processor)
---> 84         for p in self.processor: p.process(self)
     85         return self
     86 

/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in process(self, ds)
    295         tokens = []
    296         for i in progress_bar(range(0,len(ds),self.chunksize), leave=False):
--> 297             tokens += self.tokenizer.process_all(ds.items[i:i+self.chunksize])
    298         ds.items = tokens
    299 

/usr/local/lib/python3.6/dist-packages/fastai/text/transform.py in process_all(self, texts)
    118         if self.n_cpus <= 1: return self._process_all_1(texts)
    119         with ProcessPoolExecutor(self.n_cpus) as e:
--> 120             return sum(e.map(self._process_all_1, partition_by_cores(texts, self.n_cpus)), [])
    121 
    122 class Vocab():

/usr/lib/python3.6/concurrent/futures/process.py in _chain_from_iterable_of_lists(iterable)
    364     careful not to keep references to yielded objects.
    365     """
--> 366     for element in iterable:
    367         element.reverse()
    368         while element:

/usr/lib/python3.6/concurrent/futures/_base.py in result_iterator()
    584                     # Careful not to keep a reference to the popped future
    585                     if timeout is None:
--> 586                         yield fs.pop().result()
    587                     else:
    588                         yield fs.pop().result(end_time - time.monotonic())

/usr/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
    430                 raise CancelledError()
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:
    434                 raise TimeoutError()

/usr/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.


I have tried to fix this by changing the value of bs to 64, 32, and 16, and by changing the value of num_workers, but it still fails.
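
Roughly what I tried, as a sketch (only bs and num_workers were changed; if I read the traceback correctly, num_workers only affects the DataLoader, not the process pool that the tokenizer spawns):

# Sketch of the variations I tried: smaller batch sizes, same pipeline otherwise.
for bs in (64, 32, 16):
    data = (TextList.from_folder(dest)
                    .split_by_rand_pct(0.1, seed=42)
                    .label_for_lm()
                    .databunch(bs=bs, num_workers=1))  # num_workers was varied here as well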

What happens is that the RAM starts to fill up; about halfway through the processing it is full, and execution of the script stops.
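
From the traceback, the crash happens inside Tokenizer.process_all, which only spawns a ProcessPoolExecutor when n_cpus > 1 (line 118 of fastai/text/transform.py above). One thing I am considering is forcing single-process tokenization by passing a custom processor; a minimal sketch, assuming fastai v1's OpenFileProcessor / TokenizeProcessor / NumericalizeProcessor signatures and that TextList.from_folder accepts a processor argument (I have not verified that this avoids the error):

from fastai.text import *

# Sketch: tokenize in the main process only, so no worker in the pool can be
# killed while the texts are processed. `dest` is the folder from the snippet
# above; the chunksize value is a guess, smaller than the default to lower peak RAM.
processor = [OpenFileProcessor(),
             TokenizeProcessor(tokenizer=Tokenizer(n_cpus=1), chunksize=5000),
             NumericalizeProcessor()]

data = (TextList.from_folder(dest, processor=processor)
                .split_by_rand_pct(0.1, seed=42)
                .label_for_lm()
                .databunch(bs=16, num_workers=1))
data.save('tmp_lm')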

Details of the Google Colab machine:

GPU runtime, RAM: 25.51 GB, disk: 358.27 GB.

Best regards!

0 Answers:

No answers yet