ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523

Asked: 2017-10-03 06:53:58

Tags: python pandas dataframe

I am using the pandas read_csv function to read my CSV file:

feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv',header=501)

I am getting a parser error:

/home/jayashree/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
   1717     def read(self, nrows=None):
   1718         try:
-> 1719             data = self._reader.read(nrows)
   1720         except StopIteration:
   1721             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11138)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:11884)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows (pandas/_libs/parsers.c:11755)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error (pandas/_libs/parsers.c:28765)()

ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523

Following the suggestion in this thread [1], I tried adding the sep option:

feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv', sep=',',header=501)

I got the same error. When I use sep=None:

feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv', sep=None,header=501)

I get this error instead:

/home/jayashree/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in _rows_to_cols(self, content)
   2782                 msg = ('Expected %d fields in line %d, saw %d' %
   2783                        (col_len, row_num + 1, actual_len))
-> 2784                 if len(self.delimiter) > 1 and self.quoting != csv.QUOTE_NONE:
   2785                     # see gh-13374
   2786                     reason = ('Error could possibly be due to quotes being '

TypeError: object of type 'NoneType' has no len()


  [1]: https://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data

When I open the file in a spreadsheet, I do not see a problem with any of the rows. How can I resolve this error?

1 Answer:

Answer 0: (score: 0)

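You should try the quoting and quotechar parameters, which control how quoted fields are split and can help pandas recover the file's field structure. More details here: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html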
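As a rough illustration of why these two parameters matter, the sketch below reads a small made-up CSV string (not the asker's file) in which one field contains a comma inside double quotes. The sample data and variable names are assumptions for demonstration only. With quoting honoured, the row has three fields; with quoting disabled, the same row splits into four and the parser raises the same kind of "Expected N fields" error.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data: the last row has a comma inside a quoted field.
sample = 'id,comment,value\n1,"ok",10\n2,"needs rework, pad 7",20\n'

# Honour double quotes: the embedded comma stays inside a single field.
df = pd.read_csv(io.StringIO(sample), quotechar='"', quoting=csv.QUOTE_MINIMAL)

# Ignore quotes entirely: the last row now splits into four fields,
# which no longer matches the three-column header.
try:
    pd.read_csv(io.StringIO(sample), quoting=csv.QUOTE_NONE)
    raised = False
except pd.errors.ParserError:
    raised = True
```

If the real file uses a different quote character (for example a single quote), passing it as quotechar may be enough to make the field counts line up.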

Or, if there are only one (or a few) broken lines that can safely be dropped, use error_bad_lines=False.
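A minimal sketch of that second suggestion, again on made-up data rather than the asker's file. It first uses the standard csv module to list which lines have an unexpected field count, then skips them while loading. Note that this assumes pandas 1.3 or later, where on_bad_lines="skip" replaces the older error_bad_lines=False spelling.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data: line 3 has one field too many.
sample = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# Collect the 1-based line numbers whose field count differs from the header.
rows = list(csv.reader(io.StringIO(sample)))
expected = len(rows[0])
bad_lines = [i for i, row in enumerate(rows, start=1) if len(row) != expected]

# Skip malformed lines instead of raising ParserError.
# On pandas older than 1.3 the equivalent was error_bad_lines=False.
df = pd.read_csv(io.StringIO(sample), on_bad_lines="skip")
```

Checking the offending lines first is worthwhile because skipping silently discards data; a line with far more fields than expected may hide several merged records.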