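I'm building a text classifier and need some help. I used sklearn's CountVectorizer to create a bag of words, but when I try to run it through an sklearn classifier I get "ValueError: setting an array element with a sequence."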
import pandas as pd
import numpy as np
from scipy import ndimage
from skimage import io
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
df = pd.read_csv('Training_Data.csv', index_col=0, parse_dates=True)  # DataFrame.from_csv was removed in pandas 1.0
stopwordslist = frozenset(['and','the','-',',','by','/'])
count_vect = CountVectorizer(stop_words=stopwordslist)
Brandbag = count_vect.fit_transform(df.BrandName)
Brandbag.shape
#output is (6034, 1645). There are 6034 records in my df so that checks out.
df['Brandbag']=Brandbag
tempx = df['Brandbag'].head()
tempy = Y.head()
clf = MultinomialNB().fit(tempx, tempy)
The error is raised inside the check_array function when this line is called:
"array = np.array(array, dtype=dtype, order=order, copy=copy)"
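For reference, a likely cause is the assignment df['Brandbag'] = Brandbag: a DataFrame column cannot hold a 2-D scipy sparse matrix as numeric data, so the classifier ends up receiving objects instead of a numeric array. Below is a minimal sketch of one way to avoid this, by passing the sparse matrix from fit_transform straight to the classifier. The toy data and the "Label" column name are assumptions made for illustration; they are not taken from Training_Data.csv.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for Training_Data.csv.
# "Label" is an assumed name for the target column.
df = pd.DataFrame({
    "BrandName": ["acme tools", "acme paint", "zenith audio", "zenith video"],
    "Label": ["hardware", "hardware", "electronics", "electronics"],
})

# A plain list of stop words is accepted by CountVectorizer.
stop_words = ["and", "the", "by"]
count_vect = CountVectorizer(stop_words=stop_words)

# fit_transform returns a scipy sparse matrix of shape (n_samples, n_features).
brand_bag = count_vect.fit_transform(df["BrandName"])

# Pass the sparse matrix directly to the classifier instead of
# storing it in a DataFrame column first.
clf = MultinomialNB().fit(brand_bag, df["Label"])

# New text must go through the same fitted vectorizer before predicting.
prediction = clf.predict(count_vect.transform(["acme hammer"]))
print(brand_bag.shape, prediction)
```

To train on a subset, the sparse matrix can be sliced by row (for example brand_bag[:5]) alongside the matching labels, rather than calling .head() on a DataFrame column.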