How to use Python difflib to compare JSON objects with multiple fields

Time: 2019-04-04 21:22:03

Tags: python python-3.x difflib

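The desired result is for the JSON objects to look like the examples below. From there, difflib should compare both fields (name and address) of the DataToCompare record against the same fields of each DataSetAgainst record, and produce match scores similar to the JSON output example.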

JSON input:

{
  "DataToCompare": [
    {
        "name": "Alex Young",
        "address": "123 Main Street"
    }
  ],
  "DataSetAgainst": [
    {
        "name": "Bob Doll",
        "address": "555 South Street"
    },
    {
        "name": "Bob Young",
        "address": "123 Main St."
    }
  ]
}

JSON output:

[
    {
        "name": "Bob Doll",
        "Name Match": 11.8,
        "address": "555 South Street",
        "address match": <some number>
    },
    {
        "name": "Bob Young",
        "Name Match": 55.6,
        "address": "123 Main St.",
        "address match": <some number>
    }
]
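For reference, the two "Name Match" values above are consistent with how `difflib.SequenceMatcher.ratio()` is documented: it returns 2*M/T, where M is the number of matched characters and T is the combined length of both strings. The sketch below recomputes them, assuming the candidate names have their spaces stripped before comparison (so "Bob Doll" becomes "BobDoll"); `percent_match` is a hypothetical helper name, not part of difflib.

```python
import difflib


def percent_match(a, b):
    """Recompute SequenceMatcher.ratio() by hand and return it as a percentage."""
    matcher = difflib.SequenceMatcher(isjunk=None, a=a, b=b)
    # M: total number of characters covered by the matching blocks
    matched = sum(block.size for block in matcher.get_matching_blocks())
    # T: combined length of both strings
    total = len(a) + len(b)
    ratio = 2.0 * matched / total if total else 1.0
    return round(ratio * 100, 1)


# "Alex Young" vs "BobDoll": one matched character, 2*1/17
print(percent_match("Alex Young", "BobDoll"))   # 11.8
# "Alex Young" vs "BobYoung": "Young" matches, 2*5/18
print(percent_match("Alex Young", "BobYoung"))  # 55.6
```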

Currently I get back "name" and "Name Match", but I also need the match score for the address. My Python code is shown below.

import difflib
import json

def Fuzzy_request(dataIncoming):
    # The single record that every candidate is compared against
    data_to_compare = dataIncoming["DataToCompare"][0]
    mt_name = data_to_compare['name']

    # Collect the candidate names
    dataList = []
    for i in dataIncoming["DataSetAgainst"]:
        dataList.append(i["name"])

    dataResults = []

    for i in dataList:
        dataBack = {}
        # Strip everything except letters and digits from the candidate name
        clean_name = ''.join(e for e in i if e.isalnum())
        sequence = difflib.SequenceMatcher(isjunk=None, a=mt_name, b=clean_name)
        # ratio() is in [0, 1]; convert it to a percentage with one decimal place
        difference = sequence.ratio()*100
        difference = round(difference, 1)

        dataBack["name"] = i
        dataBack["Name Match"] = difference

        dataResults.append(dataBack)

    return json.dumps(dataResults)
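One possible way to get the address score as well is to loop over the fields instead of collecting only the names. The following is a minimal sketch under a few assumptions: the names `MATCH_KEYS`, `similarity`, and `fuzzy_request_all_fields` are hypothetical, the output key spellings are copied from the JSON output example, and the cleaning step mirrors the code above (only the candidate value is stripped of non-alphanumeric characters).

```python
import difflib
import json

# Hypothetical mapping from each input field to the output key for its score;
# the key spellings follow the JSON output example in the question.
MATCH_KEYS = {"name": "Name Match", "address": "address match"}


def similarity(reference, candidate):
    """Return the SequenceMatcher ratio as a percentage rounded to 1 decimal."""
    # Same cleaning as the original code: only the candidate is stripped of
    # non-alphanumeric characters; the reference value is used as-is.
    cleaned = "".join(ch for ch in candidate if ch.isalnum())
    matcher = difflib.SequenceMatcher(isjunk=None, a=reference, b=cleaned)
    return round(matcher.ratio() * 100, 1)


def fuzzy_request_all_fields(data_incoming):
    """Score every field of every candidate against the reference record."""
    reference = data_incoming["DataToCompare"][0]
    results = []
    for record in data_incoming["DataSetAgainst"]:
        row = {}
        for field, match_key in MATCH_KEYS.items():
            # A missing field is treated as an empty string (an assumption)
            value = record.get(field, "")
            row[field] = value
            row[match_key] = similarity(reference.get(field, ""), value)
        results.append(row)
    return json.dumps(results)
```

Called with the JSON input above, this should return one object per `DataSetAgainst` record containing `name`, `Name Match`, `address`, and `address match`. Whether the reference value should be cleaned the same way as the candidate is a design choice; cleaning both sides would change the scores.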

0 answers:

No answers