Cannot read this CSV file with pd.read_csv: unexpected number of fields

Asked: 2018-11-01 17:45:36

Tags: python pandas csv

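I have been trying for hours to read this file. I researched solutions and applied them, but none of them worked. The file itself opens fine in Excel, but I cannot read it with pandas.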

The call keeps raising the same error: ParserError: Expected 3 fields in line 5, saw 63

I have also looked at other questions on this topic, but none of the solutions given there fixed my problem.

Does anyone know why I cannot read this file and how to fix it? Thanks.

IN:

data=pd.read_csv('API_EN.ATM.CO2E.PC_DS2_en_csv_v2_10181020.csv',
                 header=None,
                 engine='python',
                error_bad_lines=True)

OUT:

ParserError                               Traceback (most recent call last)
<ipython-input-96-0d42116a039d> in <module>()
      2                  header=None,
      3                  engine='python',
----> 4                 error_bad_lines=True)

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
    676                     skip_blank_lines=skip_blank_lines)
    677 
--> 678         return _read(filepath_or_buffer, kwds)
    679 
    680     parser_f.__name__ = name

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
    444 
    445     try:
--> 446         data = parser.read(nrows)
    447     finally:
    448         parser.close()

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, nrows)
   1034                 raise ValueError('skipfooter not supported for iteration')
   1035 
-> 1036         ret = self._engine.read(nrows)
   1037 
   1038         # May alter columns / col_dict

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in read(self, rows)
   2264             content = content[1:]
   2265 
-> 2266         alldata = self._rows_to_cols(content)
   2267         data = self._exclude_implicit_index(alldata)
   2268 

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _rows_to_cols(self, content)
   2907                     msg += '. ' + reason
   2908 
-> 2909                 self._alert_malformed(msg, row_num + 1)
   2910 
   2911         # see gh-13320

~\Anaconda3\lib\site-packages\pandas\io\parsers.py in _alert_malformed(self, msg, row_num)
   2674 
   2675         if self.error_bad_lines:
-> 2676             raise ParserError(msg)
   2677         elif self.warn_bad_lines:
   2678             base = 'Skipping line {row_num}: '.format(row_num=row_num)

ParserError: Expected 3 fields in line 5, saw 63

Here is a sample of the CSV file:

"Country_Name","Country_Code","Indicator_Name","Indicator_Code","1960","1961","1962","1963","1964","1965","1966","1967","1968","1969","1970","1971","1972","1973","1974","1975","1976","1977","1978","1979","1980","1981","1982","1983","1984","1985","1986","1987","1988","1989","1990","1991","1992","1993","1994","1995","1996","1997","1998","1999","2000","2001","2002","2003","2004","2005","2006","2007","2008","2009","2010","2011","2012","2013","2014","2015","2016","2017",
"Aruba","ABW","CO2 emissions (metric tons per capita)","EN.ATM.CO2E.PC","","","","","","","","","","","","","","","","","","","","","","","","","","","2.86831939212055","7.23519803341258","10.0261792105306","10.6347325992922","26.3745032100275","26.0461298009966","21.4425588041328","22.000786163522","21.0362451108214","20.7719361585578","20.3183533653846","20.4268177083943","20.5876691453648","20.311566765912","26.1948752380219","25.9340244138733","25.6711617820448","26.4204520857169","26.5172934158421","27.200707780588","26.9482604728658","27.8955739972338","26.2308466448946","25.9158329472761","24.6705288731078","24.5058352032767","13.1555416906324","8.35129425218293","8.408362637892","","","",

2 answers:

Answer 0 (score: 0)

Try changing the value of the sep parameter in pd.read_csv().
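Before changing sep, it may help to check which delimiter the file actually uses. The sketch below is only an illustration: SAMPLE is a shortened copy of the two lines posted in the question, and guess_delimiter is a hypothetical helper name. It relies on csv.Sniffer from the standard library, restricted to a few common candidates.

```python
import csv

# Shortened copy of the header and data row shown in the question.
SAMPLE = (
    '"Country_Name","Country_Code","Indicator_Name","Indicator_Code","1960","1961",\n'
    '"Aruba","ABW","CO2 emissions (metric tons per capita)","EN.ATM.CO2E.PC","","",\n'
)


def guess_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter with csv.Sniffer, limited to the given candidates."""
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


delimiter = guess_delimiter(SAMPLE)
print(repr(delimiter))
```

If the guess comes back as a comma, that is already the default for sep in pd.read_csv(), so the delimiter is probably not the cause and the problem lies elsewhere in the file.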

Answer 1 (score: 0)

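Changing the code to

data=pd.read_csv('API_EN.ATM.CO2E.PC_DS2_en_csv_v2_10181020.csv', header=None, engine='python', error_bad_lines=False)

will import your CSV, but not correctly. There is probably a problem with your CSV or with the delimiter being used. Could you post line 5 of the CSV you are trying to import? For example, does the last column contain text with commas? How many columns do you expect: 3, 63, or something else?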
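One way to answer those questions is to count the fields on each line with the standard csv module. This is a minimal sketch under an assumption: SAMPLE is a made-up stand-in in which two short lines precede the wide header, which is one layout that would produce "Expected 3 fields in line 5". The preamble text and the helper names field_counts and first_widest_line are hypothetical, not taken from the real file.

```python
import csv
import io

# Hypothetical stand-in for the real file: two short preamble lines
# (invented text), then the wide header and one data row.
SAMPLE = (
    '"Some label","Some value",\n'
    '\n'
    '"Another label","Another value",\n'
    '\n'
    '"Country_Name","Country_Code","Indicator_Name","Indicator_Code","1960","1961",\n'
    '"Aruba","ABW","CO2 emissions (metric tons per capita)","EN.ATM.CO2E.PC","","",\n'
)


def field_counts(lines, delimiter=","):
    """Return (line_number, field_count) for every non-blank row."""
    reader = csv.reader(lines, delimiter=delimiter)
    return [(reader.line_num, len(row)) for row in reader if row]


def first_widest_line(counts):
    """Line number of the first row that has the maximum field count."""
    widest = max(n for _, n in counts)
    return next(line for line, n in counts if n == widest)


counts = field_counts(io.StringIO(SAMPLE))
header_line = first_widest_line(counts)
print(counts)
print(header_line)
```

For the real file, pass an open file object instead of the StringIO, for example open(path, newline=""). If the first few lines turn out to be a short preamble like this, then skipping them with the skiprows parameter, e.g. pd.read_csv(path, skiprows=header_line - 1), should let pandas pick up the wide line as the header. Note also that each line in the sample ends with a trailing comma, which shows up as one extra empty column.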