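I am using a web scraping tool (Parsehub) to extract data. When an extraction finishes, Parsehub sends information about the run (as JSON) to AWS Lambda, which I use as a webhook. However, this JSON is not escaped properly, so Lambda throws an error (it cannot parse the request body). How can I escape the JSON string so that Lambda does not throw an error? I have also tested this function from Eclipse.
I have used a simple Java type as input (https://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-req-resp.html). I have also tried a POJO (https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-pojo.html) and the byte-stream implementation (https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-stream.html) as input, but it still throws the JSON parse error.
Here is part of my Lambda handler code: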
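import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<Object, String> {
    @Override
    public String handleRequest(Object input, Context context) {
        // Log whatever Lambda managed to deserialize from the request body
        System.out.println("input - " + input);
        return "response";
    }
}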
Here is the JSON that Parsehub sends to Lambda:
{
"run_token": "I have removed this",
"status": "complete",
"md5sum": "90dc9753513a248502414e8d5345a6de /phfiles/ty6qie7-ut5C.gz ",
"custom_proxies": "",
"data_ready": 1,
"template_pages": {},
"start_time": "2019-01-30T11:01:58",
"owner_email": "I have removed this",
"webhook": "https://api endpoint of lambda function",
"is_empty": false,
"project_token": "I have removed this",
"end_time": "2019-01-30T11:02:19",
"start_running_time": "2019-01-30T11:01:59",
"options_json": "{"recoveryRules": "{}", "rotateIPs": false, "sendEmail": true, "allowPerfectSimulation": false, "ignoreDisabledElements": true, "webhook": "https://api endpoint of lambda function", "outputType": "csv", "customProxies": "", "preserveOrder": false, "startTemplate": "main_template", "allowReselection": false, "proxyDisableAdblock": false, "proxyCustomRotationHybrid": false, "maxWorkers": "0", "loadJs": true, "startUrl": "https://address of the website from which data is extracted", "startValue": "{}", "maxPages": "0", "proxyAllowInsecure": false}",
"start_value": "{}",
"start_template": "main_template",
"pages": 2,
"start_url": "https://address of the website from which data is extracted"
}
Here is the output in my CloudWatch logs:
Lambda invocation failed with status: 400. Lambda request id: eecd695e-61e7-47d9-bc27-04628c99e158
Execution failed: Could not parse request body into json: Unrecognized token 'run_token': was expecting ('true', 'false' or 'null')
at [Source: [B@36f6b2e9; line: 1, column: 11]
Here is the output in my Eclipse console:
Invoking function...
==================== INVOCATION ERROR ====================
com.amazonaws.services.lambda.model.InvalidRequestContentException: Could not parse request body into json: Unexpected character ('r' (code 114)): was expecting comma to separate Object entries
at [Source: [B@1ade7b2b; line: 15, column: 21] (Service: AWSLambda; Status Code: 400; Error Code: InvalidRequestContentException; Request ID: b46bf0b4-4bb2-4bc0-aa13-81457349153c)
We can see that the "options_json": "{"recoveryRules": "{}", ....... part of the JSON is not escaped. I cannot change the JSON that Parsehub sends; I can only process the data in Lambda.
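For comparison, here is a minimal Node.js sketch of what a correctly escaped nested value looks like next to the unescaped form. It uses only a small subset of the fields from the payload above, and the variable names are made up for illustration. It only shows the difference between the two forms; it does not repair the payload Parsehub sends.

```javascript
// Inner options object (a small subset of the fields in the payload above)
const options = { recoveryRules: '{}', rotateIPs: false, outputType: 'csv' };

// Serializing the inner object first, then the outer one, escapes the inner quotes
const escapedPayload = JSON.stringify({
  status: 'complete',
  options_json: JSON.stringify(options),
});

// The escaped form round-trips: the outer parse yields a string,
// and parsing that string yields the options object again
const outer = JSON.parse(escapedPayload);
const inner = JSON.parse(outer.options_json);

// The unescaped form, shaped like the payload above, is rejected by the parser
const unescaped = '{"status": "complete", "options_json": "{"rotateIPs": false}"}';
let unescapedParses = true;
try {
  JSON.parse(unescaped);
} catch (err) {
  unescapedParses = false;
}

console.log('escaped payload:', escapedPayload);
console.log('unescaped form parses:', unescapedParses);
```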
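Answer 0 (score: 0)
I may be late to the party, but I ran into this problem and my conclusion is:
The whole request reaches the Lambda event as one big JSON object: headers, requestContext, body... The body is not deserialized; it is just the payload of the 'body' property of that big JSON, as an escaped string.
So once it reaches the Lambda function, you have to deserialize it yourself to get an object. In a Node.js Lambda, you would do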
exports.handler = async (bigEvent, context) => {
  // Deserialize just the body; the rest of the event is already an object
  const event = JSON.parse(bigEvent.body);
  console.log('value1 =', event.key1);
  return event.key1;
};
To clarify, bigEvent looks something like
{
  version: '2.0',
  routeKey: 'POST /endpoint',
  rawPath: '/endpoint',
  rawQueryString: '',
  headers: {
    accept: '*/*',
    ...
  },
  requestContext: {
    accountId: '123456789012',
    ....
  },
  body: '{\n "key1": "importantDatum",\n "key2": "..."\n}',
  isBase64Encoded: false
}
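The event above also carries an isBase64Encoded flag, so a slightly more defensive version of the parsing step might look like the sketch below. The helper name parseBody and its { ok, value, error } result shape are hypothetical, and it assumes that when the flag is true the body holds base64-encoded UTF-8 text.

```javascript
// Hypothetical helper: turns the `body` string of the big event into an object.
// Returns { ok: true, value } on success and { ok: false, error } on failure.
function parseBody(bigEvent) {
  if (bigEvent.body == null) {
    return { ok: false, error: 'missing body' };
  }
  // Assumption: when isBase64Encoded is true, body is base64-encoded UTF-8 text
  const raw = bigEvent.isBase64Encoded
    ? Buffer.from(bigEvent.body, 'base64').toString('utf8')
    : bigEvent.body;
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    // Malformed JSON, such as an unescaped nested string, ends up here
    return { ok: false, error: err.message };
  }
}

const parsed = parseBody({
  body: '{\n "key1": "importantDatum",\n "key2": "..."\n}',
  isBase64Encoded: false,
});
console.log(parsed.ok ? parsed.value.key1 : parsed.error);
```

Catching the parse error inside the function lets the handler log the problem or answer with an error status of its own choosing instead of failing with an unhandled exception.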
If you want to respond with JSON, you should serialize it (with JSON.stringify(...)) before sending it.
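A minimal sketch of such a response follows. The helper name jsonResponse is hypothetical, and the statusCode / headers / body shape is an assumption about what the endpoint in front of the Lambda expects, so adjust it to your integration.

```javascript
// Hypothetical helper: builds a response whose body is a JSON string, not an object.
// Assumption: the caller expects an object with statusCode, headers and body fields.
function jsonResponse(statusCode, payload) {
  return {
    statusCode,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload), // serialize before sending
  };
}

const response = jsonResponse(200, { received: true, key1: 'importantDatum' });
console.log(response.body);
```

Inside the handler this would be used as `return jsonResponse(200, { received: true });`.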