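In Python, I'd like to use boto3, or something like smart_open, to read a file from S3 line by line, process each line (for example, clean up some fields), and then write the lines back to S3. The key requirement is that the file is never loaded into memory in full. Any suggestions? I tried the following, without success: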
import smart_open

# access_key, secret_key and bucket are assumed to be defined earlier
into = "s3://" + access_key + ":" + secret_key + "@" + bucket + "/Filetoread.csv"
out = "s3://" + access_key + ":" + secret_key + "@" + bucket + "/Filetowrite.csv"

def streamline(inputfile, outputfile):
    with smart_open.smart_open(inputfile, 'r') as infile, smart_open.smart_open(outputfile, 'w') as outfile:
        for line in infile:
            # each line already ends with its newline, so write it as-is
            outfile.write(line)

streamline(into, out)
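
For reference, here is a minimal sketch of the line-by-line flow I'm after, not a confirmed working solution. The names `clean_line`, `stream_lines` and `stream_s3` are hypothetical, and the comma split is only a stand-in for real field cleaning (it does not handle quoted CSV fields). The S3 part assumes a recent smart_open (version 5 or later, as far as I know), where `smart_open.open` takes a boto3 client through `transport_params`, and it assumes credentials come from the usual boto3 configuration rather than being embedded in the URI.

```python
import io


def clean_line(line):
    """Hypothetical cleaning step: trim whitespace around each comma-separated field."""
    body = line.rstrip("\r\n")
    return ",".join(field.strip() for field in body.split(",")) + "\n"


def stream_lines(infile, outfile, clean=clean_line):
    """Copy infile to outfile one line at a time, applying clean() to each line.

    Only the current line is held by this loop; returns the number of lines written.
    """
    count = 0
    for line in infile:
        outfile.write(clean(line))
        count += 1
    return count


def stream_s3(src_uri, dst_uri):
    """Stream an S3 object through stream_lines() into another S3 object.

    Assumption: smart_open >= 5, which accepts transport_params={"client": ...}.
    """
    # Imported lazily so the helpers above can be used without these packages.
    import boto3
    from smart_open import open as s3_open

    client = boto3.client("s3")  # credentials resolved by boto3 (env vars, ~/.aws, role)
    params = {"client": client}
    with s3_open(src_uri, "r", transport_params=params) as infile, \
            s3_open(dst_uri, "w", transport_params=params) as outfile:
        return stream_lines(infile, outfile)


if __name__ == "__main__":
    # Local check with in-memory files; no S3 access is needed for this part.
    src = io.StringIO(" a , b\nc,d \n")
    dst = io.StringIO()
    stream_lines(src, dst)
    print(dst.getvalue(), end="")
```

With that in place, the call I'd expect to make is something like `stream_s3("s3://" + bucket + "/Filetoread.csv", "s3://" + bucket + "/Filetowrite.csv")`. Keeping the loop in `stream_lines` separate from the S3 plumbing means it can be checked against ordinary file objects first.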