I have set up the following code:
import uuid

def send_to_bq(bigquery, events, table_id):
    # `config` is a settings dict defined elsewhere in the script
    try:
        data = {}
        data['rows'] = []
        data['skipInvalidRows'] = True  # don't drop the entire batch if there's a bad record
        data['ignoreUnknownValues'] = True  # ignore unknown fields
        for event in events:
            row = {
                'json': event,
                # Generate a unique id for each row so retries don't accidentally
                # insert duplicates
                'insertId': str(uuid.uuid4()),
            }
            data['rows'].append(row)
        if len(data['rows']) > 0:
            # print("request: " + json.dumps(data))
            return bigquery.tabledata().insertAll(
                projectId=config['FUNTOMIC_PROJECTID'],
                datasetId=config['DATASET_ID'],
                tableId=table_id,
                body=data).execute(num_retries=int(config['CHUNK_RETRIES']))
        else:
            return 'Empty Event'
    except Exception as e:
        # The error is only printed; the function then returns None
        print(str(e))
I am tailing a log file and sending the data to BQ. Every few iterations, the following exception is thrown at random:
<HttpError 400 when requesting https://www.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID/insertAll?alt=json returned "Parse Error">
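Sometimes it happens a few times a day, sometimes every few seconds. I have no idea what is going on, and I could not find anything about it in the BQ streaming documentation.
I am trying to understand whether this happens on one of the retries (a server error that I can safely ignore), or whether it is only printed once the retries have been exhausted (in which case I may be losing events).
Thanks!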
Edit 1: I changed my chunk size to 1 and printed the event, shown below. The event is valid JSON. I verified in BQ that it was not inserted.
{"is_synced": "False", "domain": "kiziland", "server_time": "1457116902", "event_type": "creature_bought", "ip": "151.62.108.127", "partial_data": "True", "agent": "Mozilla/5.0 (Android; U; it-IT) AppleWebKit/533.19.4 (KHTML, like Gecko) AdobeAIR/19.0", "currency": "coins", "elapsed_play_time": "43536", "received_at": "1457116902893", "is_converted": "False", "city": "Trento", "uuid": "tZkUiABW6J5t", "coins_left": 266057442087380840000, "platform": "Android", "is_in_kizi_app": "False", "advertising_id": "f3cb67f6-c631-4a63-bd20-824ca8317eda", "creature_level": "39", "game_version": "1.1.11", "is_in_kizi_mobile_web": "False", "index": "mobile_games", "price": "1.15280492432e+21", "stars_left": "315", "current_max_creature": "49", "event_stream_time": 1457257312.162035, "day": "2016-03-04", "sourcetype": "mobile_events", "original_version": "1.1.11", "is_native": "True", "country": "IT", "install_date": "1453487933", "session_id": "FoKr8DwFAtikNc0X2X0P", "_time": "1457116902", "game_ops_version": "0.7.5", "host_type": "android_native_app", "is_in_kizi_web": "False"}
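Edit 2 - Solution: Apparently, when converting my Python dictionary into a JSON event (some types need special handling), I was not handling the "long" type. Those values were the cause of the exceptions.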
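For anyone hitting the same thing, here is a minimal sketch of that kind of special handling. It is an illustration under assumptions, not the exact code from my pipeline: `sanitize_value` is a hypothetical helper name, the choice to turn out-of-range integers into strings is an assumption, and it is written for Python 3, where `long` has been merged into `int`. Note that `coins_left` in the event above (266057442087380840000) is larger than the signed 64-bit maximum, which may be the kind of value that trips the parser.

```python
import json

# Bounds of a signed 64-bit integer.
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def sanitize_value(value):
    """Hypothetical helper: make a value safe to embed in the JSON payload.

    Assumption: integers that do not fit in a signed 64-bit range are sent
    as strings instead of bare JSON numbers.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so handle it first.
        return value
    if isinstance(value, int) and not (INT64_MIN <= value <= INT64_MAX):
        return str(value)
    if isinstance(value, dict):
        return {key: sanitize_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_value(item) for item in value]
    return value


# A few fields taken from the event shown in Edit 1.
event = {
    "coins_left": 266057442087380840000,
    "stars_left": "315",
    "event_stream_time": 1457257312.162035,
}
clean_event = sanitize_value(event)
print(json.dumps(clean_event))
```

Whether the oversized value should be stored as a string, a float, or clamped depends on the column type in the table schema, so treat the string conversion as a placeholder for whatever your schema expects.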
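Answer 0: (score: 0)
It looks like BigQuery cannot parse the request body because of an invalid character in your request payload.
If you try to validate it [1], you will see that it is not valid JSON. The problem appears to be in "agent", where the invalid character is found.
This is the agent that was provided: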
"agent":"Mozilla/5.0 (Android; U; it-IT) AppleWebKit/533.19.4
(KHTML, like Gecko) AdobeAIR/19.0"
As you can see, there is a line break after AppleWebKit/533.19.4, which you need to remove or encode as follows:
"agent":"Mozilla/5.0 (Android; U; it-IT) AppleWebKit/533.19.4\n(KHTML, like Gecko) AdobeAIR/19.0"
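To check this locally, a small sketch along these lines may help. It uses only Python's standard `json` module; `is_valid_json` is a hypothetical helper name, and the payload is cut down to the single `agent` field for illustration. It shows that a raw line break inside a JSON string is rejected by a strict parser, while letting `json.dumps` build the payload encodes it as `\n`.

```python
import json


def is_valid_json(text):
    """Hypothetical helper: True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return False


# The agent string with a real line break inside it.
agent = ("Mozilla/5.0 (Android; U; it-IT) AppleWebKit/533.19.4\n"
         "(KHTML, like Gecko) AdobeAIR/19.0")

# Building the JSON text by hand leaves the raw line break in the string.
hand_built = '{"agent": "' + agent + '"}'

# Letting json.dumps build it escapes the line break as \n.
encoded = json.dumps({"agent": agent})

print(is_valid_json(hand_built))
print(is_valid_json(encoded))
print(encoded)
```

If events are always serialized with `json.dumps` rather than assembled by string concatenation, control characters such as line breaks should be escaped automatically.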