Reading a CSV file into a pandas DataFrame in AWS Lambda

Time: 2020-08-06 05:32:29

Tags: python pandas amazon-web-services aws-lambda

My file has the following format. I am encoding it in cp1252, and that works fine:




 id "lookupaction"  "relevanceno"   "skipindex" "fname" "lname" "suffix"    "title" "yearjoined"    "email" "phone" "activeind" "created"   "lastchanged"   "changedby"
47028   "A" "5" "0" "Stephen J."    "Salvucci"  ""  "Corporate Counsel" "2008"  "stephen_salvucci@edwards.com"  "949-250-6871"  "0" ""  ""  "44913"
53485   "A" "402"   "0" "Scott" "Rozic" ""  "Assistant General Counsel" ""  "scott.s.rozic@jpmorgan.com"    "212-648-0325"  "1" ""  ""  "47664"
53486   "A" "398"   "0" "Marisol"   "Rubecindo" ""  "Executive Director
50590   "A" "3" "0" "Jonathan"  "Chueng"    ""  "Vice President Counsel
58172   "A" "39"    "0" "Brad"  "Vining"    ""  "Senior Counsel"    ""  ""  ""  "0" ""  ""  "44913"
58173   "A" "40"    "0" "Warren"    "Zeserman"  ""  "Senior Counsel"    ""  ""  ""  "0" ""  ""  "1"
58174   "A" "1" "0" "Joe"   "Jacumin"   ""  "Deputy General Counsel
58175   "A" "1" "0" "Haley" "Peerson"   ""  "Senior Counsel"    ""  ""  "316-200-0338"  "0" ""  ""  "1"
51473   "A" "7" "0" "Kristen Williams"  "Cook"  ""  "Associate General Counsel" "2010"  "kristen.cook@7-eleven.com" ""  "0" ""  ""  "44889"
36558   "A" "2" "0" "Christopher Paul"  "Barr"  ""  "Vice President & Assistant General Counsel"    "2006"  "cbarr@harleysvillegroup.com"   "215-256-5449"  "0" ""  ""  "1"

But when I use it with AWS Lambda, I decode it as UTF-8, and that gives me a Unicode error.

I am trying to read this CSV file and convert it into a pandas DataFrame. Below is my code:

import json
import sys
import csv
from io import StringIO  # required by StringIO(data) below

import boto3
import pandas as pd

s3 = boto3.client('s3')
# s3 = boto3.resource('s3')


def lambda_handler(event, context):
    if event:
        # Bucket and key of the object referenced by the S3 event
        file_obj = event["Records"][0]
        bucketname = str(file_obj['s3']['bucket']['name'])
        filename = str(file_obj['s3']['object']['key'])
        print(filename)

        fileObj = s3.get_object(Bucket=bucketname, Key=filename)
        body = fileObj['Body']
        # Decoding as UTF-8 is the step that gives the Unicode error
        data = body.read().decode('utf-8')
        # data = csv.reader(data.split('\r\n'))
        # print(data)

        df = pd.read_csv(StringIO(data))
  

I can't find the right way to create a DataFrame in AWS Lambda. Please help, as I have been stuck on this problem for a long time.
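Since the file is described above as working with cp1252, one thing that might be worth trying is to skip the manual `.decode('utf-8')` and let pandas decode the raw bytes itself. The following is only a minimal sketch under assumptions: the helper name `csv_bytes_to_dataframe` is hypothetical, the tab separator is a guess based on how the sample looks, and the sample bytes are made up for illustration rather than taken from the real file.

```python
from io import BytesIO

import pandas as pd


def csv_bytes_to_dataframe(raw: bytes, encoding: str = "cp1252", sep: str = "\t") -> pd.DataFrame:
    """Parse raw CSV bytes, e.g. the result of fileObj['Body'].read(), into a DataFrame.

    Both defaults are assumptions about the file: cp1252 comes from the
    question text, and the tab separator is a guess from the sample layout.
    """
    # pandas decodes the bytes using the given encoding, so no prior
    # .decode('utf-8') call is needed.
    return pd.read_csv(BytesIO(raw), encoding=encoding, sep=sep)


# Made-up sample: the right single quote is byte 0x92 in cp1252,
# which is not valid UTF-8 on its own.
sample = 'id\t"title"\n47028\t"Counsel\u2019s Office"\n'.encode("cp1252")
df = csv_bytes_to_dataframe(sample)
```

Inside the handler this would be called as `df = csv_bytes_to_dataframe(fileObj['Body'].read())`. If the real file uses a different separator or encoding, the `sep` and `encoding` arguments would need to change accordingly.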

0 answers:

No answers yet