Question

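I am using a web scraping tool (Parsehub) to extract data. When an extraction finishes, Parsehub sends information about the run (as JSON) to an AWS Lambda function that I use as a webhook. However, this JSON is not escaped correctly, so Lambda throws an error (for example, "Could not parse request body"). How can I escape the JSON string so that Lambda does not throw an error? I have also tested this function from Eclipse.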

I have used simple Java types as input (https://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-req-resp.html). I have also tried a POJO (https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-pojo.html) and the byte stream implementation (https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-stream.html) as input, but it still throws a JSON parsing error.

Here is part of my Lambda handler code:

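import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<Object, String> {

    @Override
    public String handleRequest(Object input, Context context) {
        // Log whatever Lambda managed to deserialize from the request
        System.out.println("input - " + input);
        return "response";
    }
}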

This is the JSON that Parsehub sends to Lambda:

{
    "run_token": "I have removed this",
    "status": "complete",
    "md5sum": "90dc9753513a248502414e8d5345a6de /phfiles/ty6qie7-ut5C.gz ",
    "custom_proxies": "",
    "data_ready": 1,
    "template_pages": {},
    "start_time": "2019-01-30T11:01:58",
    "owner_email": "I have removed this",
    "webhook": "https://api endpoint of lambda function",
    "is_empty": false,
    "project_token": "I have removed this",
    "end_time": "2019-01-30T11:02:19",
    "start_running_time": "2019-01-30T11:01:59",
    "options_json": "{"recoveryRules": "{}", "rotateIPs": false, "sendEmail": true, "allowPerfectSimulation": false, "ignoreDisabledElements": true, "webhook": "https://api endpoint of lambda function", "outputType": "csv", "customProxies": "", "preserveOrder": false, "startTemplate": "main_template", "allowReselection": false, "proxyDisableAdblock": false, "proxyCustomRotationHybrid": false, "maxWorkers": "0", "loadJs": true, "startUrl": "https://address of the website from which data is extracted", "startValue": "{}", "maxPages": "0", "proxyAllowInsecure": false}",
    "start_value": "{}",
    "start_template": "main_template",
    "pages": 2,
    "start_url": "https://address of the website from which data is extracted"
}

This is the output in my CloudWatch logs:

Lambda invocation failed with status: 400. Lambda request id: eecd695e-61e7-47d9-bc27-04628c99e158
Execution failed: Could not parse request body into json: Unrecognized token 'run_token': was expecting ('true', 'false' or 'null')
at [Source: [B@36f6b2e9; line: 1, column: 11]

This is the output in my Eclipse console:

Invoking function...
==================== INVOCATION ERROR ====================
com.amazonaws.services.lambda.model.InvalidRequestContentException: Could not parse request body into json: Unexpected character ('r' (code 114)): was expecting comma to separate Object entries
at [Source: [B@1ade7b2b; line: 15, column: 21] (Service: AWSLambda; Status Code: 400; Error Code: InvalidRequestContentException; Request ID: b46bf0b4-4bb2-4bc0-aa13-81457349153c)

As we can see, the "options_json": "{"recoveryRules": "{}", ....... part of the JSON is not escaped. I cannot change the JSON that Parsehub sends; I can only process the data in Lambda.
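To illustrate the failure described above, here is a minimal JavaScript sketch. The sample strings and the `tryParse` helper are made up for illustration and are not taken from Parsehub. It shows that a string value containing raw double quotes stops a JSON parser, while the same value produced with `JSON.stringify` parses and can then be parsed a second time to recover the nested object.

```javascript
// Hypothetical helper: parse text and report failure instead of throwing.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// Nested JSON pasted into a string value with its quotes left unescaped,
// shortened from the shape of the payload in the question.
const broken =
  '{"options_json": "{"recoveryRules": "{}", "rotateIPs": false}", "pages": 2}';

// The same data when the sender serializes the nested object first:
// JSON.stringify escapes the inner quotes as \" automatically.
const options = { recoveryRules: '{}', rotateIPs: false };
const escaped = JSON.stringify({
  options_json: JSON.stringify(options),
  pages: 2,
});

console.log(tryParse(broken).ok);   // parsing stops at the first inner quote
console.log(tryParse(escaped).ok);  // parses; options_json is still a string
console.log(JSON.parse(tryParse(escaped).value.options_json));
```

Note that the second call only yields a string for `options_json`; a further `JSON.parse` on that string is needed to get the nested object.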

Answer 1

I may be late to the party, but I ran into this problem and these are my conclusions:

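API Gateway can manage two different protocols. They call them REST and HTTP.
The HTTP protocol has "routes". Every route has an integration with a payload format version.
When you design a webhook in the simplest possible way, most choices are already opinionated, so you can go with the default catch-all route and payload format v2.0 for a seamless integration between API Gateway and Lambda.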

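As a result, every request goes straight to the Lambda event as one big JSON object: headers, requestContext, body, and so on. The body is not deserialized; it is simply the value of the 'body' property of this big JSON, as an escaped string.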

So when it reaches the Lambda function, you have to handle it accordingly: deserialize it to get an object. In a Node.js Lambda, you would do:

exports.handler = async (bigEvent, context) => {
    // Deserializing just the body
    const event = JSON.parse(bigEvent.body);
    console.log('value1 =', event.key1);
    return event.key1;
};

To clarify, bigEvent looks something like this:

{
  version: '2.0',
  routeKey: 'POST /endpoint',
  rawPath: '/endpoint',
  rawQueryString: '',
  headers: {
    accept: '*/*',
    ...
  },
  requestContext: {
    accountId: '123456789012',
    ....
  },
  body: '{\n    "key1": "importantDatum",\n    "key2": "..."\n}',
  isBase64Encoded: false
}
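Given an event of that shape, a slightly more defensive version of the body handling could look like the sketch below. It is only an illustration: `readBodyText` and `parseBody` are hypothetical helper names, and the sketch assumes the event carries the `body` and `isBase64Encoded` fields shown above. It decodes the body when it is flagged as base64 and reports a parse failure instead of letting `JSON.parse` throw.

```javascript
// Hypothetical helper: return the request body as text,
// undoing base64 encoding when the event flags it.
function readBodyText(bigEvent) {
  const raw = bigEvent.body ?? '';
  return bigEvent.isBase64Encoded
    ? Buffer.from(raw, 'base64').toString('utf8')
    : raw;
}

// Hypothetical helper: deserialize the body, returning a result object
// so the caller can decide what to do with an unparseable payload.
function parseBody(bigEvent) {
  const text = readBodyText(bigEvent);
  try {
    return { ok: true, data: JSON.parse(text) };
  } catch (err) {
    // Keep the raw text so it can be logged or repaired by the caller.
    return { ok: false, error: err.message, raw: text };
  }
}

// Usage with a made-up event shaped like the one above.
const sample = {
  body: '{\n    "key1": "importantDatum",\n    "key2": "..."\n}',
  isBase64Encoded: false,
};
const parsed = parseBody(sample);
console.log(parsed.ok ? parsed.data.key1 : parsed.error);
```

Inside the handler, a call to `parseBody(bigEvent)` would take the place of the bare `JSON.parse(bigEvent.body)`, so a malformed webhook payload can be logged rather than crashing the invocation.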

If you want to respond with JSON, you should serialize it (with JSON.stringify(...)) before sending it.
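As a sketch of that last point, the function below builds a response whose body is a serialized string. `jsonResponse` is a hypothetical helper name, and the `statusCode` / `headers` / `body` shape is assumed to be the response format used with payload format v2.0; check it against your own integration settings.

```javascript
// Hypothetical helper: wrap a JavaScript value in a Lambda-style response
// whose body is a JSON string rather than an object.
function jsonResponse(statusCode, payload) {
  return {
    statusCode,
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(payload), // serialize before sending
  };
}

// Usage: what a handler might return after processing the webhook.
const response = jsonResponse(200, { key1: 'importantDatum' });
console.log(typeof response.body);
console.log(response.body);
```

A handler would then end with something like `return jsonResponse(200, result);`, so the client receives the body as text it can parse itself.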
