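I am learning scikit-learn to do some classification, and I am following a tutorial using my own dataset. When I run the script below, I get a TypeError: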
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

data = pd.DataFrame({'Description': pd.Categorical(["apple", "table", "red"]), 'Labels': pd.Categorical(["Fruit", "Furniture", "Color"])})
counts = CountVectorizer().fit_transform(data['Description'].values)
tf_transformer = TfidfTransformer(use_idf=False).fit(counts)
train_tf = tf_transformer.transform(tf_transformer)
This is the error I get:
Traceback (most recent call last):
  File "/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-97-9a649172d3b7>", line 10, in <module>
    train_tf = tf_transformer.transform(tf_transformer)
  File "/anaconda/lib/python2.7/site-packages/sklearn/feature_extraction/text.py", line 1005, in transform
    X = sp.csr_matrix(X, dtype=np.float64, copy=copy)
  File "/anaconda/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 69, in __init__
    self._set_self(self.__class__(coo_matrix(arg1, dtype=dtype)))
  File "/anaconda/lib/python2.7/site-packages/scipy/sparse/coo.py", line 204, in __init__
    self.data = self.data.astype(dtype)
TypeError: float() argument must be a string or a number
I must be doing something very silly because I don't fully understand the API. Can someone tell me how to get unstuck?
Thanks.
Answer 0 (score: 1)
The error comes from this line:
tf_transformer.transform(tf_transformer)
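The call itself is wrong: tf_transformer is a TfidfTransformer object, but transform expects the sparse count matrix (counts here), not the transformer. scipy then tries to convert the transformer object to float64, which raises the TypeError. Pass counts to transform, or use fit_transform to fit and transform in one step: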
train_tf = TfidfTransformer(use_idf=False).fit_transform(counts)
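
For reference, here is a minimal end-to-end sketch of both variants on the question's sample data. It assumes a reasonably recent scikit-learn and pandas. The conversion with astype(str).tolist() and the name train_tf_one_step are my own additions for illustration, not part of the original script.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

data = pd.DataFrame({
    'Description': pd.Categorical(["apple", "table", "red"]),
    'Labels': pd.Categorical(["Fruit", "Furniture", "Color"]),
})

# Convert the categorical column to a plain list of strings (an assumption
# made here so the vectorizer receives ordinary text documents).
docs = data['Description'].astype(str).tolist()
counts = CountVectorizer().fit_transform(docs)

# Variant 1: fit the transformer, then transform the count matrix
# (not the transformer object itself).
tf_transformer = TfidfTransformer(use_idf=False).fit(counts)
train_tf = tf_transformer.transform(counts)

# Variant 2: fit and transform in a single call.
train_tf_one_step = TfidfTransformer(use_idf=False).fit_transform(counts)

print(train_tf.shape)
print(np.allclose(train_tf.toarray(), train_tf_one_step.toarray()))
```

Both variants should yield the same sparse matrix. The two-step form is the one to keep if you later need to apply the same fitted transformer to test data.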