Question

I am trying to run this code:

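import re
from nltk.util import ngrams

# read the whole file and lowercase it
with open('Textfile.txt', 'r') as text1:
    raw_text = text1.read().lower()

# replace everything except letters, digits and whitespace with a space
raw_text = re.sub(r'[^a-zA-Z0-9\s]', ' ', raw_text)

# split on spaces and drop empty tokens
tokens = [token for token in raw_text.split(" ") if token != ""]
# generate ngrams
output = list(ngrams(tokens, 2))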

I get the following error:

StopIteration                             Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\util.py in ngrams(sequence, n, pad_left, pad_right, left_pad_symbol, right_pad_symbol)
    467     while n > 1:
--> 468         history.append(next(sequence))
    469         n -= 1

StopIteration: 

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-11-2ce960ed385c> in <module>()
      7 tokens = [token for token in raw_text.split(" ") if token != ""]
      8 # generate ngrams
----> 9 output = list(ngrams(tokens, 2))
     10 try:
     11     yield next(seq)

RuntimeError: generator raised StopIteration

My question is: how can I apply n-grams of 2 (bigrams) to any text file without getting this error? It would be great if you could help me solve this problem :)
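For reference, here is a minimal sketch of one way to build the bigrams without going through the generator that fails in the traceback. It assumes the error occurs because the token list is empty or shorter than n (for example, an empty file) on Python 3.7+, where a StopIteration escaping a generator is turned into a RuntimeError; that cause is a guess, not something confirmed above. The helper names `tokenize` and `safe_ngrams` are hypothetical and are not part of NLTK.

```python
import re


def tokenize(raw_text):
    """Lowercase the text, replace non-alphanumeric characters with spaces,
    and split on any whitespace (spaces, tabs, newlines)."""
    cleaned = re.sub(r'[^a-zA-Z0-9\s]', ' ', raw_text.lower())
    # str.split() with no argument never produces empty strings
    return cleaned.split()


def safe_ngrams(tokens, n=2):
    """Return the n-grams of `tokens` as a list of tuples.

    Hypothetical helper: if there are fewer than n tokens, zip() simply
    yields nothing, so the result is an empty list instead of an error.
    """
    return list(zip(*(tokens[i:] for i in range(n))))


if __name__ == "__main__":
    sample = "Hello, world! Hello again."
    print(safe_ngrams(tokenize(sample)))
    print(safe_ngrams(tokenize("")))  # empty input gives an empty list
```

To use it with a file, the text read from `Textfile.txt` would be passed to `tokenize` in place of the sample string. Upgrading NLTK may also make the original `ngrams` call work, since newer releases are reported to handle short sequences, but that is not verified here.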

RuntimeError: generator raised StopIteration when generating tokens from a txt file

0 answers: