My code is as follows:
    import numpy as np
    from nltk.tokenize import TweetTokenizer
    from nltk import pos_tag

    class tag_tokenizer:
        tokenizer = TweetTokenizer()  # learn tokenizing stuff
        dt = np.dtype([("token", "U16"), ("pos_tag", "U5")])

        def __init__(self, rawDocs):
            self.tagged_data = np.array([pos_tag(tokenizer(rawDoc)) for rawDoc in rawDocs], dtype=dt)
But when I pass an array of strings to an instance of the class, I get this error:
NameError: name 'tokenizer' is not defined
Clearly my class structure isn't working as intended. How can I fix this?
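
For context, the likely cause: names assigned in a class body (here `tokenizer` and `dt`) become class attributes, and a method body does not treat the class namespace as an enclosing scope. Inside `__init__` they have to be reached as `self.tokenizer` and `self.dt` (or `tag_tokenizer.tokenizer`). Separately, a `TweetTokenizer` instance is not called directly; tokenizing goes through its `tokenize()` method. The sketch below shows only the scoping part. `WhitespaceTokenizer` and `TagTokenizer` are hypothetical stand-ins so it runs without nltk, not the original class.

```python
class WhitespaceTokenizer:
    """Hypothetical stand-in for nltk's TweetTokenizer (assumption for this sketch)."""

    def tokenize(self, text):
        # Split on whitespace; the real tokenizer is far more sophisticated.
        return text.split()


class TagTokenizer:
    # Class attribute: stored in the class namespace, not visible as a bare
    # name inside methods.
    tokenizer = WhitespaceTokenizer()

    def __init__(self, raw_docs):
        # Writing `tokenizer.tokenize(doc)` here would raise NameError;
        # attribute lookup through `self` finds the class attribute.
        self.tokens = [self.tokenizer.tokenize(doc) for doc in raw_docs]


tagged = TagTokenizer(["hello world", "class scope demo"])
print(tagged.tokens)  # [['hello', 'world'], ['class', 'scope', 'demo']]
```

Applied to the code in the question, that would presumably mean `self.tokenizer.tokenize(rawDoc)` and `dtype=self.dt`. Whether the resulting lists fit a single structured array is a separate matter that depends on the documents having equal token counts.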