Unable to in if statement

Time: 2018-01-03 03:38:04

Tags: python list if-statement

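I am trying to create a loop that retrieves the tokenized data values from a list, checks whether each tokenized cell value contains stop words, and appends it to a new list.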

# Importing the packages to be used

import xlrd
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

# Declaration of file path of the data and opening of workbook and worksheet

file_path = "C:/Users/L31101/Documents/Data/Copy_1.xlsx"
workbook = xlrd.open_workbook(file_path)
worksheet = workbook.sheet_by_name("ConsolidateModuleQnComment")

# Grabs the numbers of rows and columns of the worksheet

rowcount = worksheet.nrows
columncount = worksheet.ncols

# Prints the number of row and columns

print("\nRow count: %d" % rowcount)
print("Column count: %d" % columncount)

# Grabbing the cell values and placing them inside an array named data_value

data_value = []

for rowindex in range(2, rowcount):
    # print("\nCurrent row number: %d" % rowindex)
    # print(worksheet.cell_value(rowindex, 6))
    data_value.append(worksheet.cell_value(rowindex, 6))

# Grabbing the values inside data_value cell and tokenizes them, and then adds them into the data_tokenized array

data_tokenized = []

for valueindex in range(0, len(data_value)):
    data_tokenized.append(word_tokenize(data_value[valueindex]))

# Grabbing the tokenized values from the data_tokenized array and removing the stopwords

stop_words = set(stopwords.words("english"))

data_stopword_removed = []

for tokenizedindex in range(0, len(data_tokenized)):
    if data_tokenized[tokenizedindex] not in stop_words:
        data_stopword_removed.append(data_tokenized[tokenizedindex])

print("\nNumber of records: %d" % len(data_stopword_removed))

It gives the following error message:

C:\Users\L31101\PycharmProjects\Year3\venv\Scripts\python.exe C:/Users/L31101/PycharmProjects/Year3/SentimentAnalysis.py

Row count: 5792
Column count: 7
Traceback (most recent call last):
  File "C:/Users/L31101/PycharmProjects/Year3/SentimentAnalysis.py", line 47, in <module>
    if test_variable not in stop_words:
TypeError: unhashable type: 'list'

Process finished with exit code 1

Any ideas on how I can fix this?

1 answer:

Answer 0 (score: 0)

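Try printing test_variable just before the error occurs. It will be a list. A list cannot be put into a set, because lists are mutable and do not have the required __hash__ method. And if a list cannot be put into a set, you cannot search a set for a list either. Hence the error unhashable type.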
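The behaviour can be reproduced in a few lines. This is only a sketch: the small stop_words set and the tokens list are hypothetical stand-ins for set(stopwords.words("english")) and one entry of data_tokenized from the question.

```python
# Hypothetical stand-in for set(stopwords.words("english"))
stop_words = {"the", "is", "a"}

# Hypothetical stand-in for one tokenized cell (word_tokenize returns a list of strings)
tokens = ["the", "course", "is", "good"]

# A string is hashable, so a membership test against the set works
string_is_member = "the" in stop_words
print(string_is_member)

# A list is not hashable, so the same membership test raises TypeError
error_message = None
try:
    tokens in stop_words
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # the message mentions: unhashable type: 'list'
```

Each element of data_tokenized in the question is a list like tokens here, which is why the if statement fails on the first iteration.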

Without knowing what you are testing here, I cannot say what your fix should be. But whatever it is, you need to test against something other than a list.
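One plausible reading of the question is that the goal is to drop stop words from every tokenized record. Under that assumption, a possible fix is to test each token (a string) rather than the whole token list. The stop_words set and the sample data_tokenized below are hypothetical stand-ins so that the sketch runs without NLTK or the spreadsheet.

```python
# Hypothetical stand-ins for the question's data; the real script would use
# set(stopwords.words("english")) and the lists returned by word_tokenize.
stop_words = {"the", "is", "a", "was"}
data_tokenized = [
    ["the", "module", "was", "useful"],
    ["The", "lecturer", "is", "a", "good", "teacher"],
]

data_stopword_removed = []

for tokens in data_tokenized:
    # Test each token (a hashable string) against the set, not the list itself
    filtered = [token for token in tokens if token.lower() not in stop_words]
    data_stopword_removed.append(filtered)

print(data_stopword_removed)
print("\nNumber of records: %d" % len(data_stopword_removed))
```

The token.lower() call is a design choice, not something the question asks for: NLTK's English stop words are all lowercase, so without it a capitalized "The" would not be filtered out. Iterating over the lists directly also removes the need for range(0, len(...)) indexing.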