When I run the following code, I get an error:
import os
import boto3
import pandas as pd
import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO  # Python 2.x
else:
    from io import StringIO  # Python 3.x
# get your credentials from environment variables
aws_id = 'XX'
aws_secret = 'YY'
client = boto3.client('s3', aws_access_key_id=aws_id,
                      aws_secret_access_key=aws_secret)
bucket_name = 'arpbhatnagar'
object_key = 'application_train.csv'
csv_obj = client.get_object(Bucket=bucket_name, Key=object_key)
body = csv_obj['Body']
csv_string = body.read().decode('utf-8')
train = pd.read_csv(StringIO(csv_string))
I get the following error:
MemoryError                               Traceback (most recent call last)
in ()
     21 csv_obj = client.get_object(Bucket=bucket_name, Key=object_key)
     22 body = csv_obj['Body']
---> 23 csv_string = body.read().decode('utf-8')
     24
     25 train = pd.read_csv(StringIO(csv_string), low_memory=True, engine='python')

/usr/lib/python2.7/encodings/utf_8.pyc in decode(input, errors)
     14
     15 def decode(input, errors='strict'):
---> 16     return codecs.utf_8_decode(input, errors, True)
     17
     18 class IncrementalEncoder(codecs.IncrementalEncoder):

MemoryError:
Answer 0: (score: 0)
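It looks like you ran out of memory while downloading or ingesting application_train.csv. The traceback points at the decode step, where the whole file is held in memory as raw bytes and then again as a decoded string before pandas has parsed anything. To work around this, you can download the file to disk first and then pass the filename to pandas: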
tmp_filename = "/tmp/application_train.csv"
client.download_file(bucket_name, object_key, tmp_filename)
training_set = pd.read_csv(tmp_filename)
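If the file still does not fit in memory once it is on disk, one option is to let pandas parse it in pieces with the `chunksize` argument of `pd.read_csv` and keep only the rows you need from each piece. The sketch below is a minimal illustration of that idea, not part of the original answer; the helper name `load_filtered`, the `keep` callback and the default chunk size are assumptions made for the example.

```python
import pandas as pd


def load_filtered(path, keep, chunksize=100000):
    """Read a CSV in chunks and keep only the rows selected by `keep`.

    `load_filtered` and `keep` are hypothetical names for this sketch.
    `keep` receives one chunk (a DataFrame) and returns a boolean mask.
    """
    kept = []
    # With chunksize set, read_csv yields DataFrames of at most
    # `chunksize` rows instead of parsing the whole file at once.
    for chunk in pd.read_csv(path, chunksize=chunksize):
        kept.append(chunk[keep(chunk)])
    # Only the filtered rows are combined into the final DataFrame.
    return pd.concat(kept, ignore_index=True)
```

With the file from above, this could be called as `load_filtered(tmp_filename, lambda chunk: chunk["some_column"] > 0)`, where `some_column` stands in for whichever column you filter on. This only helps when the filtered result is much smaller than the full file; if you need every row, reducing the columns with `usecols` or choosing smaller dtypes are the usual alternatives.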