Question

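I am trying to implement simple_tokenize using the dictionary that my earlier code produces, but I get an error message. Any help with the code below would be greatly appreciated. I am using Python 2.7 in Jupyter.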

import csv
reader = csv.reader(open('data.csv'))

dictionary = {}
for row in reader:
    key = row[0]
    dictionary[key] = row[1:]
print dictionary

The code above works fine, but the problem is with the following:

import re

words = dictionary
split_regex = r'\W+'

def simple_tokenize(string):

    for i in rows:
        word = words.split
    #pass

print word

I get this error:

NameError                                 Traceback (most recent call last)
<ipython-input-2-0d0e05fb1556> in <module>()
      1 import re
      2 
----> 3 words = dictionary
      4 split_regex = r'\W+'
      5 

NameError: name 'dictionary' is not defined

Answer 1

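Variables are not saved between Jupyter sessions unless you explicitly do so yourself. So if you ran the first code block, then quit the Jupyter session, started a new session, and ran the second code block, dictionary would not carry over from the first session and would therefore be undefined, which is exactly what the error says.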
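One way to save a variable explicitly is to write it to disk before the session ends and load it again in the next one. The following is only a minimal sketch using the standard pickle module; the sample dictionary contents and the file name dictionary.pkl are assumptions made for illustration, not taken from the question.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the dictionary built from data.csv in the question.
dictionary = {'doc1': ['Hello world', 'foo bar']}

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.gettempdir(), 'dictionary.pkl')

# Session 1: save the variable explicitly before the kernel shuts down.
with open(path, 'wb') as f:
    pickle.dump(dictionary, f)

# Session 2: load it back instead of assuming it still exists.
with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == dictionary)
```

In practice the two halves would live in different sessions. Simply re-running the cell that builds dictionary at the start of the new session is the other obvious option.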

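If you ran the code blocks above in some other way (for example, not across separate Jupyter sessions), you should say so, but the tags and the traceback suggest that this is what you did.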
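Once dictionary is defined in the same session, the tokenizer itself still needs work: the posted function loops over an undefined name rows, never calls split, and returns nothing. The sketch below is one possible reading of what was intended, not a definitive fix. It assumes simple_tokenize should lowercase a string and split it on split_regex, and it uses a hypothetical in-memory sample in place of data.csv.

```python
import csv
import re

# Hypothetical in-memory rows standing in for data.csv.
sample_lines = ['doc1,Hello world,foo-bar', 'doc2,Jupyter is neat']

# Build the dictionary in the same session that uses it.
dictionary = {}
for row in csv.reader(sample_lines):
    dictionary[row[0]] = row[1:]

split_regex = r'\W+'

def simple_tokenize(string):
    """Lowercase a string and split it on runs of non-word characters."""
    # Lowercasing is an assumption; drop .lower() to keep the original case.
    return [token for token in re.split(split_regex, string.lower()) if token]

# Tokenize every field of every row, keyed like the original dictionary.
tokens = {}
for key, fields in dictionary.items():
    tokens[key] = [tok for field in fields for tok in simple_tokenize(field)]

print(tokens['doc1'])
```

The filter on empty tokens matters because re.split yields empty strings when the input starts or ends with a non-word character.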
