
My data is of class 'scipy.sparse.csr.csr_matrix' and looks like this:

  (0, 55)   1
  (0, 54)   1
  (1, 55)   1
  (1, 54)   1
  (1, 55)   1
  (1, 54)   1
  (2, 945)  1
  (2, 945)  1
  (2, 950)  1
  ...

I need to transform it. First I tried using

sklearn.feature_extraction.text.TfidfTransformer

but it did not improve the roc_auc score. Next I tried using

from sklearn.feature_extraction.text import TfidfVectorizer
tfidf = TfidfVectorizer(max_features=13, max_df=0.5, min_df=0.1, ngram_range=(1, 2))
data_tfidf = tfidf.fit_transform(data)

but it raises the error

AttributeError: lower not found

How can I fix this?
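
For context, a likely cause: TfidfVectorizer expects raw text documents (an iterable of strings) and lowercases each one by default, so handing it an already-vectorized sparse matrix fails when it tries to call lower() on a matrix row. For a matrix of term counts, TfidfTransformer is the matching tool. The sketch below is only an illustration under that assumption; the small `counts` matrix and the `docs` list are made-up stand-ins, not the data from the question.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import TfidfTransformer, TfidfVectorizer

# Hypothetical stand-in for the count matrix in the question
# (rows = documents, columns = terms, values = term counts).
counts = csr_matrix(np.array([[1, 1, 0],
                              [2, 2, 0],
                              [0, 0, 3]]))

# TfidfTransformer accepts a count matrix directly and re-weights it.
data_tfidf = TfidfTransformer().fit_transform(counts)

# TfidfVectorizer, by contrast, starts from raw strings: it tokenizes,
# counts, and applies TF-IDF weighting in one step.
docs = ["the cat sat", "the cat sat on the mat", "dogs bark loudly"]  # made-up text
text_tfidf = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(docs)

print(data_tfidf.shape, text_tfidf.shape)
```

If the original text is still available, TfidfVectorizer can be fitted on it instead of on the sparse matrix; if only the counts exist, TfidfTransformer is the option that fits the input type.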

Sklearn: using TfidfVectorizer

0 answers: