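How do I parse different JSON schemas that contain similar data?

Time: 2019-05-09 15:11:26

Tags: python json python-3.x oop

Suppose I have the following two schemas. In each, I send a message to a websocket stream and receive a message back that contains similar data.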

# First Schema
x_sent = {"Product": {"id": "123"}}

x_received = {"properties": {
    "id": {"type": "number"},
    "color": "green"}}

# Second Schema
y_sent = {"Item": {"Product": {"uid": "123"}}}

y_received = {"configs": {
    "id_number": "123",
    "type": "int",
    "colour": "green"}}

To tell the two streams apart, I can filter on the message content:

# msg is the parsed message (a dict), so test for its top-level key
if "properties" in msg:
    use_schema_a()
elif "configs" in msg:
    use_schema_b()

But that isn't very DRY once the number of schemas grows. I could also do this:

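msg_routing = {"properties": use_schema_a,
               "configs": use_schema_b}

# Dispatch on the first top-level key that has a registered handler
for key in msg:
    if key in msg_routing:
        msg_routing[key]()
        break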

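But I would still be writing a function for every schema! I feel like I'm missing something conceptually. I would like to create one generic class that handles sending and receiving messages, and keep the stream-specific filtering data in some kind of configuration file.

It might look something like this: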

{"schemaA": {"name": "service_ABC", "color": "properties.color", "send_id":"Product:id"},
 "schemaB": {"name": "service_DEF", "color": "configs.colour", "send_id":"Item:Product:uid"}}

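As in the examples above, the data I need back is the same (green in this example), and the ID I need to send is similar too (123 in this example).

So, if I know the schema of the data I need to send and receive, how can I dynamically build something that understands that schema?

To give you a clear starting point: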

def on_message(received_msg):
    # The unparsed message we receive is something like
    #      {"properties": {
    #     "id": {"type": "number"},
    #     "color": "green"}}

    # Do our message filtering/parsing

    handle_message_contents(service_name, color)

1 Answer:

Answer 0 (score: 2)

First, you need to use the schemas to create the messages. Suppose you define the schemas like this, for example:

schemas = {
  "service_ABC": {
    "send": {
      "id": ["Product", "id"],
    },
    "receive": {
      "color": ["properties", "color"],
    },  
  },
  "service_DEF": {
    "send": {
      "id": ["Item", "Product", "uid"],
      "cond": ["Item", "Condition"],
    },
    "receive": {
      "color": ["configs", "colour"],
    },
  },
}

Then you can write a function that, given the service name and the right arguments, builds the data dictionary to send:

def build_request(service, **kwargs):
  request = {}
  for attribute, path in schemas[service]["send"].items():
    # Walk (and create) every nesting level except the last one
    parent = request
    for level in path[:-1]:
      parent = parent.setdefault(level, {})
    # The final path element is the key that holds the value
    parent[path[-1]] = kwargs[attribute]
  return request

This way, you can add further parameters to send directly in the schema. Some examples:

build_request("service_ABC", id="123") == {
  "Product": {
    "id": "123"
  }
}

build_request("service_DEF", id="123", cond="New") == {
  "Item": {
    "Product": {
      "uid": "123"
    },
    "Condition": "New"
  }    
}
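
Note that build_request() raises a bare KeyError if an argument named in the schema is not supplied, and it silently ignores arguments the schema does not mention. Below is a minimal sketch of one way to check the arguments up front and serialize the result for the socket. build_payload is a hypothetical helper name, and sending JSON text is an assumption about what the websocket expects:

```python
import json

# Trimmed copy of the "send" half of the schemas above
schemas = {
    "service_ABC": {"send": {"id": ["Product", "id"]}},
    "service_DEF": {"send": {"id": ["Item", "Product", "uid"],
                             "cond": ["Item", "Condition"]}},
}


def build_request(service, **kwargs):
    request = {}
    for attribute, path in schemas[service]["send"].items():
        parent = request
        for level in path[:-1]:
            parent = parent.setdefault(level, {})
        parent[path[-1]] = kwargs[attribute]
    return request


def build_payload(service, **kwargs):
    """Hypothetical helper: validate kwargs against the schema, return JSON text."""
    expected = set(schemas[service]["send"])
    missing = expected - set(kwargs)
    unexpected = set(kwargs) - expected
    if missing or unexpected:
        raise ValueError(
            f"{service}: missing={sorted(missing)}, unexpected={sorted(unexpected)}"
        )
    return json.dumps(build_request(service, **kwargs))


payload = build_payload("service_DEF", id="123", cond="New")
```

Whether unexpected arguments should be an error or simply dropped is a design choice; the sketch rejects them so that a typo in an argument name does not go unnoticed.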

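Next, you need to determine which service a message came from. The best place to do that is somewhere upstream, passing the result along to the "schema processor". If you can't get that information alongside the message (which I doubt), you can use one of the approaches you suggested.

Once you know which service the message came from (and therefore which schema to use), you can process it in much the same way as the request was built.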

def process(service, msg):
  result = dict()
  for attribute, path in schemas[service]["receive"].items():
    value = msg
    for field in path:
      value = value[field]
    result[attribute] = value
  return result

Again, an example:

x_received = {
  "properties": {
    "id": {
      "type": "number"
    },
    "color": "green"
  }
}
process("service_ABC", x_received) == {
  "color": "green"
}
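
To tie this back to the on_message starting point in the question, here is a rough sketch of binding the service name upstream, one callback per connection. It assumes the socket delivers JSON text and that each connection belongs to exactly one service. handle_message_contents is only a stand-in for the question's handler, and the on_abc_message / on_def_message names are made up for illustration:

```python
import json
from functools import partial

# Trimmed copy of the "receive" half of the schemas above
schemas = {
    "service_ABC": {"receive": {"color": ["properties", "color"]}},
    "service_DEF": {"receive": {"color": ["configs", "colour"]}},
}


def process(service, msg):
    result = {}
    for attribute, path in schemas[service]["receive"].items():
        value = msg
        for field in path:
            value = value[field]
        result[attribute] = value
    return result


handled = []  # stand-in sink so the example can be checked


def handle_message_contents(service_name, color):
    # Placeholder for the question's handler; its real body is not shown there
    handled.append((service_name, color))


def on_message(service, raw_msg):
    # Assumption: raw_msg is JSON text as received from the socket
    received_msg = json.loads(raw_msg)
    fields = process(service, received_msg)
    handle_message_contents(service, fields["color"])


# One callback per connection, with the service name bound upstream
on_abc_message = partial(on_message, "service_ABC")
on_def_message = partial(on_message, "service_DEF")

on_abc_message('{"properties": {"id": {"type": "number"}, "color": "green"}}')
# Assumes "colour" sits inside "configs", as the schema path implies
on_def_message('{"configs": {"id_number": "123", "type": "int", "colour": "green"}}')
```

With the service bound when the connection is set up, the message handler never has to guess which schema applies.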

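If you really cannot keep the service variable around to pass it to process(), then I think the best option is your msg_routing approach. You can keep it as a separate dictionary or even add it to schemas. Alternatively, you can always check whether process() found everything it expected and, if not, try the next schema: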

def process(msg):
  # Try each schema in turn; return the first whose "receive" paths all resolve
  for service, schema in schemas.items():
    missing_something = False
    result = dict()
    for attribute, path in schema["receive"].items():
      value = msg
      for field in path:
        # Guard against non-dict values: `in` on a str would be a substring test
        if not isinstance(value, dict) or field not in value:
          missing_something = True
          break
        value = value[field]
      if missing_something:
        break
      result[attribute] = value
    if not missing_something:
      return service, result
  raise RuntimeError("No schema applies")
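
The msg_routing alternative mentioned above can also be derived from schemas instead of being maintained by hand. This is a sketch only, assuming every receive path of a service starts with a top-level key that no other service uses. build_routing and identify_service are hypothetical names:

```python
# Trimmed copy of the "receive" half of the schemas above
schemas = {
    "service_ABC": {"receive": {"color": ["properties", "color"]}},
    "service_DEF": {"receive": {"color": ["configs", "colour"]}},
}


def build_routing(schemas):
    """Map the first key of every receive path to the service that owns it."""
    routing = {}
    for service, schema in schemas.items():
        for path in schema["receive"].values():
            top = path[0]
            # setdefault returns the existing owner if the key was already claimed
            if routing.setdefault(top, service) != service:
                raise ValueError(f"key {top!r} is claimed by more than one service")
    return routing


msg_routing = build_routing(schemas)


def identify_service(msg):
    # Return the service registered for the first recognized top-level key
    for key in msg:
        if key in msg_routing:
            return msg_routing[key]
    raise RuntimeError("No schema applies")


x_received = {"properties": {"id": {"type": "number"}, "color": "green"}}
service = identify_service(x_received)
```

Unlike the try-each-schema loop, this fails at startup when two services would be ambiguous, rather than quietly returning whichever schema happens to come first.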