Opening a CSV file in Google Colab

Date: 2018-06-19 08:00:59

Tags: python pandas csv google-colaboratory

I am trying to open a .csv file in Google Colab this way:

import pandas as pd
from google.colab import files
uploaded = files.upload()

for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))

import io
df = pd.read_csv(io.StringIO(uploaded['test.csv'].decode('utf-8')))

But I get this error:

ParserError                               Traceback (most recent call last)
 in ()
      1 import io
----> 2 df = pd.read_csv(io.StringIO(uploaded['test.csv'].decode('utf-8')))
      3 print(df)

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
    707                     skip_blank_lines=skip_blank_lines)
    708
--> 709         return _read(filepath_or_buffer, kwds)
    710
    711     parser_f.__name__ = name

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    453
    454     try:
--> 455         data = parser.read(nrows)
    456     finally:
    457         parser.close()

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in read(self, nrows)
   1067                 raise ValueError('skipfooter not supported for iteration')
   1068
-> 1069         ret = self._engine.read(nrows)
   1070
   1071         if self.options.get('as_recarray'):

/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py in read(self, nrows)
   1837     def read(self, nrows=None):
   1838         try:
-> 1839             data = self._reader.read(nrows)
   1840         except StopIteration:
   1841             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error()

ParserError: Error tokenizing data. C error: Expected 1 fields in line 19, saw 2

So what should I do to open a .csv file in Google Colab?
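The last line of the traceback suggests that the parser inferred a single column and then met a 19th line that split into two fields. The following is only a sketch of how that message can arise and how the offending line might be inspected. The file content here is hypothetical (the real test.csv is not shown in the question), and it is built in memory so the snippet runs outside Colab.

```python
import io

import pandas as pd
from pandas.errors import ParserError

# Hypothetical content: a one-column header, 17 one-field rows, and a
# 19th line that contains a comma and therefore splits into two fields.
lines = ["value"] + [str(i) for i in range(17)] + ["17,extra"]
csv_text = "\n".join(lines) + "\n"

try:
    pd.read_csv(io.StringIO(csv_text))
    error_message = None
except ParserError as err:
    # Keep the message so it can be compared with the one in the traceback.
    error_message = str(err)

# Look at the raw 19th line (index 18) to see the unexpected delimiter.
bad_line = csv_text.splitlines()[18]
print(error_message)
print(repr(bad_line))
```

If the real file looks like this, possible remedies include correcting the file, passing the delimiter it actually uses through the `sep` parameter of `pd.read_csv`, or skipping non-tabular leading lines with `skiprows`. Which one applies depends on what line 19 of the file really contains.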

2 Answers:

Answer 0: (score: 0)

Add your file to Google Drive and try this:

!pip install -U -q PyDrive

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()

gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

file_list = drive.ListFile({'q': "'<FOLDER ID>' in parents and trashed=false"}).GetList()
for file1 in file_list:
  print('title: %s, id: %s' % (file1['title'], file1['id']))


title: train.csv, id: <TRAIN_FILE_ID>
title: test.csv, id: <TEST_FILE_ID>

train_downloaded = drive.CreateFile({'id': '<TRAIN_FILE_ID>'})
train_downloaded.GetContentFile('train.csv')
test_downloaded = drive.CreateFile({'id': '<TEST_FILE_ID>'})
test_downloaded.GetContentFile('test.csv')  

import pandas as pd
import numpy as np
df_train = pd.read_csv('train.csv')
df_train

See this link for more details:

http://nali.org/load-google-drive-csv-panda-dataframe-google-colab/

Answer 1: (score: 0)

You don't need StringIO; the test.csv file has already been uploaded to the Colab environment.

import pandas as pd
from google.colab import files
uploaded = files.upload()

df = pd.read_csv('test.csv')
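If you would rather read straight from the returned dictionary than from disk, a minimal sketch follows. It assumes, as the question's own code does, that `uploaded` maps each file name to that file's raw bytes. The dictionary literal below is a stand-in with made-up content so the snippet runs outside Colab, where `google.colab` is not available.

```python
import io

import pandas as pd

# Stand-in for the result of files.upload() (assumption: file name -> bytes).
uploaded = {"test.csv": b"name,score\nalice,1\nbob,2\n"}

frames = {}
for filename, raw_bytes in uploaded.items():
    # Wrapping the bytes in BytesIO lets pandas do the decoding,
    # so no manual .decode('utf-8') call is needed.
    frames[filename] = pd.read_csv(io.BytesIO(raw_bytes))

df = frames["test.csv"]
print(df.shape)
```

Iterating over the keys also avoids hard-coding the file name, which helps if the uploaded file ends up with a different name than expected.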