I have a Kinesis stream, and I created a Firehose delivery stream that saves all the data to S3; it was saving correctly into hourly folders. Then I wrote a Firehose transformation Lambda, and after deploying it all the messages end up in the same folder. I am not sure what I am missing. My Lambda function's response includes the following fields:
result.put("recordId", record.getRecordId());
result.put("result", "Ok");
result.put("approximateArrivalEpoch", record.getApproximateArrivalEpoch());
result.put("approximateArrivalTimestamp",record.getApproximateArrivalTimestamp());
result.put("kinesisRecordMetadata", record.getKinesisRecordMetadata());
result.put("data", Base64.getEncoder().encodeToString(jsonData.getBytes()));
Edit:
Here is my code in Java. I am using KinesisFirehoseEvent; no decoding was needed in my case, and I get a ByteBuffer from the KinesisFirehoseEvent:
public JSONObject handler(KinesisFirehoseEvent kinesisFirehoseEvent, Context context) {
    final LambdaLogger logger = context.getLogger();
    final JSONArray resultArray = new JSONArray();
    for (final KinesisFirehoseEvent.Record record : kinesisFirehoseEvent.getRecords()) {
        final byte[] data = record.getData().array();
        final Optional<TestData> testData = deserialize(data, logger);
        if (testData.isPresent()) {
            final JSONObject jsonObj = new JSONObject();
            final String jsonData = gson.toJson(testData.get());
            jsonObj.put("recordId", record.getRecordId());
            jsonObj.put("result", "Ok");
            jsonObj.put("approximateArrivalEpoch", record.getApproximateArrivalEpoch());
            jsonObj.put("approximateArrivalTimestamp", record.getApproximateArrivalTimestamp());
            jsonObj.put("kinesisRecordMetadata", record.getKinesisRecordMetadata());
            jsonObj.put("data", Base64.getEncoder().encodeToString(jsonData.getBytes()));
            resultArray.add(jsonObj);
        } else {
            logger.log("testData not deserialized");
        }
    }
    final JSONObject jsonFinalObj = new JSONObject();
    jsonFinalObj.put("records", resultArray);
    return jsonFinalObj;
}
Answer 0 (score: 0)
The data returned by the Lambda function is not in the correct format. Each record in the transformation response must contain only the recordId, result, and data fields.
See the following example:
'use strict';
console.log('Loading function');

/* Stock Ticker format parser */
const parser = /^\{\"TICKER_SYMBOL\"\:\"[A-Z]+\"\,\"SECTOR\"\:"[A-Z]+\"\,\"CHANGE\"\:[-.0-9]+\,\"PRICE\"\:[-.0-9]+\}/;

exports.handler = (event, context, callback) => {
    let success = 0; // Number of valid entries found
    let failure = 0; // Number of invalid entries found
    let dropped = 0; // Number of dropped entries

    /* Process the list of records and transform them */
    const output = event.records.map((record) => {
        const entry = Buffer.from(record.data, 'base64').toString('utf8');
        const match = parser.exec(entry);
        if (match) {
            const parsed_match = JSON.parse(match[0]);
            const milliseconds = new Date().getTime();
            /* Add timestamp and convert to CSV */
            const result = `${milliseconds},${parsed_match.TICKER_SYMBOL},${parsed_match.SECTOR},${parsed_match.CHANGE},${parsed_match.PRICE}` + "\n";
            const payload = Buffer.from(result, 'utf8').toString('base64');
            if (parsed_match.SECTOR != 'RETAIL') {
                /* Dropped event, notify and leave the record intact */
                dropped++;
                return {
                    recordId: record.recordId,
                    result: 'Dropped',
                    data: record.data,
                };
            } else {
                /* Transformed event */
                success++;
                return {
                    recordId: record.recordId,
                    result: 'Ok',
                    data: payload,
                };
            }
        } else {
            /* Failed event, notify the error and leave the record intact */
            console.log("Failed event: " + record.data);
            failure++;
            return {
                recordId: record.recordId,
                result: 'ProcessingFailed',
                data: record.data,
            };
        }
        /* This transformation is the "identity" transformation, the data is left intact:
        return {
            recordId: record.recordId,
            result: 'Ok',
            data: record.data,
        } */
    });
    console.log(`Processing completed. Successful records ${output.length}.`);
    callback(null, { records: output });
};
The documentation below has more details on the required return format:
https://aws.amazon.com/blogs/compute/amazon-kinesis-firehose-data-transformation-with-aws-lambda/
Hope this helps.
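Since the question's code is in Java, that same response shape can be sketched as follows. This is a minimal, hypothetical example (the class and method names are mine, not from the question): each returned record carries only the three keys Firehose accepts — recordId, result, and data — and the extra metadata fields from the original code are simply dropped.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

public class FirehoseResponseSketch {

    // Build one response record with ONLY the three keys Firehose accepts.
    static Map<String, String> transformRecord(String recordId, String jsonData) {
        Map<String, String> out = new LinkedHashMap<>();
        out.put("recordId", recordId);   // must echo the incoming recordId unchanged
        out.put("result", "Ok");         // "Ok" | "Dropped" | "ProcessingFailed"
        out.put("data", Base64.getEncoder()
                .encodeToString(jsonData.getBytes())); // payload re-encoded as base64
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> rec = transformRecord("rec-1", "{\"field\":\"value\"}");
        System.out.println(rec);
    }
}
```

In the question's handler, this would mean removing the approximateArrivalEpoch, approximateArrivalTimestamp, and kinesisRecordMetadata puts from each jsonObj before returning it.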
Answer 1 (score: 0)
I got this working with just the code above. It turned out the stream was slow, so a new hour had not been reached yet and no new hourly folder had been created.
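For context on the "hourly folder" behavior: Firehose's default S3 key prefix is the UTC delivery hour (YYYY/MM/DD/HH), and objects only land once the delivery stream's buffering size or interval threshold is hit, so a slow stream lags behind the wall clock. A small sketch of computing the prefix you would expect for the current hour (the bucket name in the comment is a placeholder):

```shell
# Firehose's default S3 key prefix is the UTC delivery hour: YYYY/MM/DD/HH
prefix=$(date -u +%Y/%m/%d/%H)
echo "expected current prefix: ${prefix}"
# To check whether it exists yet, list it with the AWS CLI (hypothetical bucket):
# aws s3 ls "s3://my-firehose-bucket/${prefix}/"
```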