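Hobbyist - Python newbie
Hello, I'm working through Wes McKinney's book Python for Data Analysis. I've just started on the MovieLens 1M dataset, but for the life of me I can't get my code to work for the ratings.dat file. It works for movies.dat and users.dat, but I keep getting an error for ratings.dat. I downloaded copies of ratings.dat from both GitHub and movielens.org and get the same error. I renamed the file and still get the same error. I moved it to another directory and still get the same error. I'm guessing I have some configuration problem?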
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
Type "copyright", "credits" or "license" for more information.
IPython 0.13.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
%guiref -> A brief reference about the graphical user interface.
Welcome to pylab, a matplotlib-based Python environment [backend: TkAgg].
For more information, type 'help(pylab)'.
import pandas as pd
rnames = ['user_id','movie_id','rating','timestamp']
ratings = pd.read_table('e:\ratings.dat',sep='',header=None,names=rnames)
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-1-5513dd9baafa> in <module>()
3 rnames = ['user_id','movie_id','rating','timestamp']
4
----> 5 ratings = pd.read_table('e:\ratings.dat',sep='',header=None,names=rnames)
6
E:\Python27_new\lib\site-packages\pandas\io\parsers.pyc in parser_f(filepath_or_buffer, sep, dialect, compression, doublequote, escapechar, quotechar, quoting, skipinitialspace, lineterminator, header, index_col, names, prefix, skiprows, skipfooter, skip_footer, na_values, true_values, false_values, delimiter, converters, dtype, usecols, engine, delim_whitespace, as_recarray, na_filter, compact_ints, use_unsigned, low_memory, buffer_lines, warn_bad_lines, error_bad_lines, keep_default_na, thousands, comment, decimal, parse_dates, keep_date_col, dayfirst, date_parser, memory_map, nrows, iterator, chunksize, verbose, encoding, squeeze)
397 buffer_lines=buffer_lines)
398
--> 399 return _read(filepath_or_buffer, kwds)
400
401 parser_f.__name__ = name
E:\Python27_new\lib\site-packages\pandas\io\parsers.pyc in _read(filepath_or_buffer, kwds)
206
207 # Create the parser.
--> 208 parser = TextFileReader(filepath_or_buffer, **kwds)
209
210 if nrows is not None:
E:\Python27_new\lib\site-packages\pandas\io\parsers.pyc in __init__(self, f, engine, **kwds)
505 self.options['has_index_names'] = kwds['has_index_names']
506
--> 507 self._make_engine(self.engine)
508
509 def _get_options_with_defaults(self, engine):
E:\Python27_new\lib\site-packages\pandas\io\parsers.pyc in _make_engine(self, engine)
607 def _make_engine(self, engine='c'):
608 if engine == 'c':
--> 609 self._engine = CParserWrapper(self.f, **self.options)
610 else:
611 if engine == 'python':
E:\Python27_new\lib\site-packages\pandas\io\parsers.pyc in __init__(self, src, **kwds)
888 # #2442
889 kwds['allow_leading_cols'] = self.index_col is not False
--> 890 self._reader = _parser.TextReader(src, **kwds)
891
892 # XXX
E:\Python27_new\lib\site-packages\pandas\_parser.pyd in pandas._parser.TextReader.__cinit__ (pandas\src\parser.c:2771)()
E:\Python27_new\lib\site-packages\pandas\_parser.pyd in pandas._parser.TextReader._setup_parser_source (pandas\src\parser.c:4810)()
atings.dat does not exist
The last line of the error always cuts off the first part of the file name. As mentioned, the same code works for movies.dat and users.dat.
Answer 0 (score: 2)
Try escaping the backslash in the source path: change e:\ratings.dat
to e:\\ratings.dat
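As a quick check of the escaping idea, here is a minimal sketch that compares the spellings of the path as plain strings. It does not touch the file system, and the variable names are only for illustration.

```python
# Three spellings of the same Windows path. No file access is needed here.
broken = 'e:\ratings.dat'    # '\r' is parsed as one carriage-return character
escaped = 'e:\\ratings.dat'  # the doubled backslash yields one literal backslash
raw = r'e:\ratings.dat'      # raw string literal: the backslash is kept as written

print('\r' in broken)         # True: the path contains a carriage return
print('\\' in broken)         # False: there is no backslash left in it
print(escaped == raw)         # True: both spell the intended path
print(len(broken), len(raw))  # 13 14
```

On Windows, forward slashes ('e:/ratings.dat') are normally accepted by Python's file functions as well, which avoids the escaping question entirely.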
Answer 1 (score: 1)
You should write the path string as a raw string (note the r in front of it):
ratings = pd.read_table(r'e:\ratings.dat', sep='', header=None, names=rnames)
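This fails because \r has a special meaning (carriage return), so the string no longer contains the real file path and Python cannot find the file. In a raw string, a backslash is kept as a literal character instead of starting an escape sequence. It also explains why the same code works for movies.dat and users.dat: in a Python 2 str, \m and \u are not escape sequences, so the backslash survives.
You can see the difference here: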
In [1]: print ('\r')
In [2]: print (r'\r')
\r
Alternatively, you can "escape" each \ character (using \\), as @pravin suggested.
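The carriage return probably also explains why the last line of the traceback looks cut off. A rough sketch, assuming the error message has the form "File <path> does not exist" (the exact wording inside pandas is an assumption here): a terminal that honours the carriage return jumps back to the start of the line and overwrites whatever was printed before it.

```python
# Rebuild a message like the one in the traceback.
# The message format below is assumed, not taken from the pandas source.
path = 'e:\ratings.dat'  # same literal as in the question, '\r' included
message = 'File %s does not exist' % path

# Split at the carriage return to see the two halves the terminal receives.
before, after = message.split('\r')
print(repr(before))  # 'File e:'
print(repr(after))   # 'atings.dat does not exist'
```

Because the second half is longer than the first, it completely overwrites it on screen, leaving only "atings.dat does not exist" visible.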