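I'm new to GCP and I'm trying to build a simple API with Google Cloud Functions. The API needs to read a CSV from a Google Storage bucket and return JSON. Locally this works fine, because I can just open a file.
In the Cloud Function, however, I get a Blob from the bucket and don't know how to work with it, and I get an error message.
I tried converting the blob to bytes and to a string, but I don't know how to do that.
Code that works in my local environment: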
    import csv
    import datetime

    data1 = '2019-08-20'
    data1 = datetime.datetime.strptime(data1, '%Y-%m-%d')
    data2 = '2019-11-21'
    data2 = datetime.datetime.strptime(data2, '%Y-%m-%d')
    total = 0
    with open("/home/thiago/mycsvexample.csv", "r") as fin:
        # create a CSV dictionary reader object
        print(type(fin))
        csv_dreader = csv.DictReader(fin)
        # iterate over all rows in CSV dict reader
        for row in csv_dreader:
            # check for invalid Date values
            # convert date string to a date object
            date = datetime.datetime.strptime(row['date'], '%Y-%m-%d')
            # check if date falls within requested range
            if date >= data1 and date <= data2:
                total = total + float(row['total'])
    print(total)
Code in Google Cloud Functions:
    import csv, datetime
    from google.cloud import storage
    from io import BytesIO

    def get_orders(request):
        """Responds to any HTTP request.
        Args:
            request (flask.Request): HTTP request object.
        Returns:
            The response text or any set of values that can be turned into a
            Response object using
            `make_response <http://flask.pocoo.org/docs/1.0/api/#flask.Flask.make_response>`.
        """
        request_json = request.get_json()
        if request.args and 'token' in request.args:
            if request.args['token'] == 'mytoken888888':
                client = storage.Client()
                bucket = client.get_bucket('mybucketgoogle.appspot.com')
                blob = bucket.get_blob('mycsvfile.csv')
                byte_stream = BytesIO()
                blob.download_to_file(byte_stream)
                byte_stream.seek(0)
                file = byte_stream
                #with open(BytesIO(blob), "r") as fin:
                #create a CSV dictionary reader object
                csv_dreader = csv.DictReader(file)
                #iterate over all rows in CSV dict reader
                for row in csv_dreader:
                    #check for invalid Date values
                    date = datetime.datetime.strptime(row['date'], '%Y-%m-%d')
                    #check if date falls within requested range
                    if date >= datetime.datetime.strptime(request.args['start_date']) and date <= datetime.datetime.strptime(request.args['end_date']):
                        total = total + float(row['total'])
                dict = {'total_faturado' : total}
                return dict
            else:
                return f'Passe parametros corretos'
        else:
            return f'Passe parametros corretos'
Google Cloud Function error:
    Traceback (most recent call last):
      File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 346, in run_http_function
        result = _function_handler.invoke_user_function(flask.request)
      File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 217, in invoke_user_function
        return call_user_function(request_or_event)
      File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker.py", line 210, in call_user_function
        return self._user_function(request_or_event)
      File "/user_code/main.py", line 31, in get_orders_tramontina
        for row in csv_dreader:
      File "/opt/python3.7/lib/python3.7/csv.py", line 111, in __next__
        self.fieldnames
      File "/opt/python3.7/lib/python3.7/csv.py", line 98, in fieldnames
        self._fieldnames = next(self.reader)
    _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
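I've tried a few other things, without success...
Can anyone help me solve this, either by doing the conversion or by showing me the right way to do it?
Thanks, everyone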
Answer 0 (score: 1)
Here is the code that worked for me:
    from google.cloud import storage
    import csv
    import datetime

    client = storage.Client()
    bucket = client.get_bucket('source')
    blob = bucket.blob('file')
    dest_file = '/tmp/file.csv'
    # download the object to local disk so it can be opened in text mode
    blob.download_to_filename(dest_file)

    dict = {}
    total = 0
    # `request` is the flask.Request passed to the Cloud Function
    start = datetime.datetime.strptime(request.args['start_date'], '%Y-%m-%d')
    end = datetime.datetime.strptime(request.args['end_date'], '%Y-%m-%d')
    with open(dest_file) as fh:
        # assuming your csv is delimited by commas
        rd = csv.DictReader(fh, delimiter=',')
        for row in rd:
            date = datetime.datetime.strptime(row['date'], '%Y-%m-%d')
            # check if date falls within requested range
            if start <= date <= end:
                total = total + float(row['total'])
    dict['total_faturado'] = total
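If you would rather not write a temporary file, the in-memory stream from the question can probably be kept and wrapped in io.TextIOWrapper, which decodes the bytes so that csv.DictReader receives strings. This is only a minimal sketch, assuming the CSV is UTF-8 encoded. The name sum_totals is a hypothetical helper, and the sample bytes stand in for whatever blob.download_to_file(byte_stream) would have written into the BytesIO.

```python
import csv
import datetime
import io


def sum_totals(byte_stream, start_date, end_date):
    """Sum the 'total' column for rows whose 'date' lies in [start_date, end_date]."""
    start = datetime.datetime.strptime(start_date, '%Y-%m-%d')
    end = datetime.datetime.strptime(end_date, '%Y-%m-%d')
    # TextIOWrapper decodes the binary stream, so csv sees str lines, not bytes.
    # Assumption: the file is UTF-8 encoded.
    text_stream = io.TextIOWrapper(byte_stream, encoding='utf-8', newline='')
    total = 0.0
    for row in csv.DictReader(text_stream):
        date = datetime.datetime.strptime(row['date'], '%Y-%m-%d')
        if start <= date <= end:
            total += float(row['total'])
    return total


# Stand-in for the BytesIO filled by blob.download_to_file(byte_stream),
# already rewound with seek(0).
sample = io.BytesIO(
    b"date,total\n2019-08-19,5.0\n2019-08-20,10.5\n2019-11-21,4.5\n"
)
print(sum_totals(sample, '2019-08-20', '2019-11-21'))  # prints 15.0
```

In the Cloud Function you would pass the byte_stream from the question instead of sample, after calling byte_stream.seek(0).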
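Answer 1 (score: 0)
Try downloading the file as a string. That way you can check for invalid data values and, if needed, write it to a file afterwards.
Change blob.download_to_file(byte_stream) to my_blob_str = blob.download_as_string().
I think your actual problem is byte_stream = BytesIO(), because your output says: iterator should return strings, not bytes (did you open the file in text mode?)
It expects strings but is getting bytes. What is byte_stream for? If it serves no particular purpose, remove it.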
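One caveat, as far as I know: despite its name, download_as_string() returns bytes, so the result still has to be decoded before csv.DictReader will accept it (newer releases of google-cloud-storage also offer download_as_bytes() and download_as_text()). Below is a minimal sketch of the decoding step, assuming UTF-8 data. The name rows_from_bytes is a hypothetical helper, and the literal bytes stand in for the value returned by the download call.

```python
import csv
import io


def rows_from_bytes(data, encoding='utf-8'):
    """Decode raw CSV bytes and return the rows as a list of dicts."""
    # Decode first: csv.DictReader needs an iterator of str, not bytes.
    text = data.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for: my_blob_str = blob.download_as_string()
my_blob_str = b"date,total\n2019-08-20,10.5\n2019-11-21,4.5\n"
for row in rows_from_bytes(my_blob_str):
    print(row['date'], row['total'])
```

The date filtering and summing from the question can then run over these rows unchanged.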
Answer 2 (score: 0)
I was also able to do this with the gcsfs library
https://gcsfs.readthedocs.io/en/latest/
    import csv
    import datetime
    import json

    import gcsfs

    def get_orders_tramontina(request):
        """Responds to any HTTP request.
        Args:
            request (flask.Request): HTTP request object.
        Returns:
            The response text or any set of values that can be turned into a
            Response object using
            `make_response <http://flask.pocoo.org/docs/1.0/api/#flask.Flask.make_response>`.
        """
        request_json = request.get_json()
        if request.args and 'token' in request.args:
            if request.args['token'] == 'mytoken':
                fs = gcsfs.GCSFileSystem(project='myproject')
                total = 0
                # opening in "r" (text) mode yields str lines, which csv expects
                with fs.open('mybucket.appspot.com/mycsv.csv', "r") as fin:
                    csv_dreader = csv.DictReader(fin)
                    #iterate over all rows in CSV dict reader
                    for row in csv_dreader:
                        #check for invalid Date values
                        date = datetime.datetime.strptime(row['date'], '%Y-%m-%d')
                        #check if date falls within requested range
                        if date >= datetime.datetime.strptime(request.args['start_date'], '%Y-%m-%d') and date <= datetime.datetime.strptime(request.args['end_date'], '%Y-%m-%d'):
                            total = total + float(row['total'])
                dict = {'total_faturado' : total}
                return json.dumps(dict)
        # missing or wrong token: return the same message as the question's code
        return 'Passe parametros corretos'