I unload a table from Redshift to a file in S3:
query = ("unload ('select * from " + str(table) + "')" +
"to 's3://crm-files-redshift/" + str(folder) + "_'" +
"credentials 'aws_access_key_id=" + str(username) + ";aws_secret_access_key=" + str(password) +
"' delimiter ',' addquotes escape allowoverwrite; commit;")
After the unload to S3, I connect to MySQL with pymysql:
conn = pymysql.connect(host=host, port=port, user=db_user, passwd=db_pass, db=schema, autocommit = True, connect_timeout=36000,
local_infile=True, max_allowed_packet=1000*1024*1024-1)
cur = conn.cursor()
Then, to load the file into MySQL, I try this:
load data local infile
'path to file'
into table mytable
FIELDS terminated by ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
But at this point I get a broken pipe error:
Traceback (most recent call last):
File "/opt/python3/lib/python3.4/site-packages/pymysql/connections.py", line 927, in _write_bytes
self.socket.sendall(data)
BrokenPipeError: [Errno 32] Broken pipe
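Note: I am running this from an EC2 instance against a MySQL RDS instance. I am not using threads of any kind. Python 3.4. The file downloads to EC2 correctly; I also downloaded it to my Windows PC and loaded it into the same DB successfully with LOAD DATA through MySQL Workbench. The files are about 400 MB each.
What could be causing this?
Edit: I download the file to EC2 and run the load from there.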