I am reading the Book-Crossings dataset with the Surprise library on Python 2.7, and I have run into an encoding problem. Here is my code:
import pandas as pd
from surprise import Dataset, Reader, SVD, evaluate

# Load the book ratings dataset, which contains (user-id, ISBN and book rating [0..10])
# A raw string keeps the backslashes in the Windows path from being read as escapes
df = pd.read_csv(r"Desktop\ML project\BX-CSV-Dump\BX-Book-Ratings.csv", sep=';', encoding="CP1252")
# Create the reader in the proper format
reader = Reader(line_format='user item rating', sep=';', rating_scale=(0, 10))
ratings = Dataset.load_from_df(df, reader)
# Test that surprise is working by running SVD on the dataset
ratings.split(5)
algo = SVD()
evaluate(algo, ratings, measures=['RMSE', 'MAE'])
The exception I get:
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-23-03cf8b43a4db> in <module>()
9 ratings.split(5)
10 algo = SVD()
---> 11 evaluate(algo, ratings, measures=['RMSE', 'MAE'])
C:\Users\Jihed Mestiri\Anaconda2\lib\site-packages\scikit_surprise-latest-py2.7-win-amd64.egg\surprise\evaluate.pyc
in evaluate(algo, data, measures, with_dump, dump_dir, verbose)
66 # train and test algorithm. Keep all rating predictions in a list
67 algo.train(trainset)
---> 68 predictions = algo.test(testset, verbose=(verbose == 2))
69
70 # compute needed performance statistics
C:\Users\Jihed Mestiri\Anaconda2\lib\site-packages\scikit_surprise-latest-py2.7-win-amd64.egg\surprise\prediction_algorithms\algo_base.pyc
in test(self, testset, verbose)
150 r_ui_trans - self.trainset.offset,
151 verbose=verbose)
--> 152 for (uid, iid, r_ui_trans) in testset]
153 return predictions
154
C:\Users\Jihed Mestiri\Anaconda2\lib\site-packages\scikit_surprise-latest-py2.7-win-amd64.egg\surprise\prediction_algorithms\algo_base.pyc
in predict(self, uid, iid, r_ui, clip, verbose)
93 iiid = self.trainset.to_inner_iid(iid)
94 except ValueError:
---> 95 iiid = 'UKN__' + str(iid)
96
97 details = {}
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in position 4: ordinal not in range(128)
I have tried other encodings as well, but that did not help either; I get a similar error.
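
For reference, the last frame of the traceback suggests where this comes from: on Python 2, calling `str(iid)` on a `unicode` value implicitly encodes it as ASCII, so an ISBN containing a character such as `u'\xe8'` would raise no matter which `encoding` is passed to `read_csv`. The sketch below reproduces that failure with an explicit `.encode("ascii")` (so it behaves the same on Python 2 and 3) and shows one possible workaround. The helper `to_ascii_id` and the sample ISBN are made up for illustration; neither is part of Surprise or the dataset.

```python
from __future__ import print_function


def explicit_ascii_encode(text):
    """Do explicitly what Python 2's str(unicode_obj) does implicitly."""
    return text.encode("ascii")


def to_ascii_id(raw_id):
    """Hypothetical helper: escape non-ASCII characters so the ID is ASCII-only."""
    return raw_id.encode("ascii", "backslashreplace").decode("ascii")


# Made-up ISBN with u'\xe8' at position 4, mirroring the error message
bad_isbn = u"0345\xe8123"

try:
    explicit_ascii_encode(bad_isbn)
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("encoding failed:", exc)

# After escaping, the ID contains only ASCII characters
safe_isbn = to_ascii_id(bad_isbn)
print(safe_isbn)  # prints 0345\xe8123
```

If this is indeed the cause, applying such a helper to the item column before `Dataset.load_from_df` (for example `df['ISBN'] = df['ISBN'].map(to_ascii_id)`, assuming the column is named `ISBN` and holds only strings) might avoid the error. This is untested against the actual dataset.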