Comparing values in a nested dictionary and a list

Asked: 2013-04-26 19:32:47

Tags: python list dictionary comparison nested-lists

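I want to compare the values of two variables (a dictionary and a list). The dictionary is nested, so I have to iterate over all of its items. I found a simple solution, but I am fairly sure it can be done in a better, more Pythonic way. In short, I want to find the users in users_from_database that do not appear in users_from_client.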

My solution:

#variable containing users from client side
users_from_client = {
  "0": {
    "COL1": "whatever",
    "COL2": "val1",
    "COL3": "whatever",
  },
  "1": {
    "COL1": "whatever",
    "COL2": "val2",
    "COL3": "whatever",
  },
  "3": {
    "COL1": "whatever",
    "COL2": "val3",
    "COL3": "whatever",
  }    
} 

#variable containing users from the database
users_from_database = [
  ["val1"],
  ["val2"],
  ["val5"],
  ["val7"]
]

#This function is used to find element from the nested dictionaries(d)
def _check(element, d, pattern = 'COL2'):
  exist = False
  for k, user in d.iteritems():
    for key, item in user.iteritems():
      if key == pattern and item == element:
        exist = True
  return exist

#Finding which users should be removed from the database
to_remove = []
for user in users_from_database:
  if not _check(user[0], users_from_client):
    if user[0] not in to_remove:
      to_remove.append(user[0])

#to_remove list contains: ['val5', 'val7']

What is a better, more Pythonic way to get the same result? I probably don't need to add that I am new to Python (I assume you can tell from the code above).

3 Answers:

Answer 0: (score: 1)

Just use an error-safe dictionary lookup (dict.get):

def _check(element, d, pattern = 'COL2'):
    for user in d.itervalues():
        if user.get(pattern) == element:
            return True
    return False

Or as a one-liner:

def _check(element, d, pattern = 'COL2'):
    return any(user.get(pattern) == element for user in d.itervalues())

Or try doing the whole job in a single expression:

#Finding which users should be removed from the database
to_remove = set(
    name
    for (name,) in users_from_database
    if not any(user.get('COL2') == name for user in users_from_client.itervalues())
)

assert to_remove == {"val5", "val7"}

Using sets makes it more concise (and more efficient):

to_remove = set(
    user for (user,) in users_from_database
) - set(
    user.get('COL2') for user in users_from_client.itervalues()
)
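
One thing to keep in mind: a set does not preserve the order of the database rows, whereas the loop in the question built an ordered list. If that order matters, something along these lines might work. This is only a sketch, written to run on Python 3 (hence `.values()` rather than `.itervalues()`); the extra duplicate `["val5"]` row and the `seen` helper set are additions for illustration and are not in the question's data.

```python
# Data from the question, trimmed to the relevant column.
users_from_client = {
    "0": {"COL1": "whatever", "COL2": "val1"},
    "1": {"COL1": "whatever", "COL2": "val2"},
    "3": {"COL1": "whatever", "COL2": "val3"},
}
# The last row is a made-up duplicate, added to show de-duplication.
users_from_database = [["val1"], ["val2"], ["val5"], ["val7"], ["val5"]]

# Build the client-side names once so each membership test is a set lookup.
client_names = {user.get("COL2") for user in users_from_client.values()}

to_remove = []
seen = set()  # names already appended, to avoid duplicates
for (name,) in users_from_database:
    if name not in client_names and name not in seen:
        seen.add(name)
        to_remove.append(name)

print(to_remove)  # ['val5', 'val7']
```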

Your data structure is a bit odd. Consider using:

users_from_client = [
  {
    "COL1": "whatever",
    "COL2": "val1",
    "COL3": "whatever",
  }, {
    "COL1": "whatever",
    "COL2": "val2",
    "COL3": "whatever",
  }, {
    "COL1": "whatever",
    "COL2": "val3",
    "COL3": "whatever",   
  }
] 

#variable containing users from the database
users_from_database = {
  "val1",
  "val2",
  "val5",
  "val7"
}

Which reduces your code to:

to_remove = users_from_database - set(
    user.get('COL2') for user in users_from_client
)
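
As a rough sketch of how this could be packaged, the set difference can be wrapped in a small helper. The name `find_stale_users` and its `column` parameter are made up for illustration; the data mirrors the restructured example above, and the code is written to run on Python 3. Unlike `user.get('COL2')`, this version skips rows that lack the column instead of adding `None` to the set, which is a design choice rather than a requirement.

```python
def find_stale_users(db_names, client_rows, column="COL2"):
    """Return the names in db_names that no client row carries in `column`.

    Hypothetical helper, not part of the original answer.
    """
    # Rows lacking the column are skipped rather than contributing None.
    client_names = {row[column] for row in client_rows if column in row}
    return set(db_names) - client_names


users_from_client = [
    {"COL1": "whatever", "COL2": "val1", "COL3": "whatever"},
    {"COL1": "whatever", "COL2": "val2", "COL3": "whatever"},
    {"COL1": "whatever", "COL2": "val3", "COL3": "whatever"},
]
users_from_database = {"val1", "val2", "val5", "val7"}

to_remove = find_stale_users(users_from_database, users_from_client)
print(sorted(to_remove))  # ['val5', 'val7']
```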

Answer 1: (score: 0)

I don't know of any super-elegant way to do this, but you can make a few small improvements to your code.

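First, you never use k, so you may as well iterate over just the values. Second, you don't need to keep track of exist; you can return as soon as you find a match. Finally, since you are checking for a key-value pair, you can simply test whether the (key, value) tuple is among the dictionary's items.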

def _check(element, d, pattern = 'COL2'):
  for user in d.itervalues():
    if (pattern, element) in user.items():
      return True
  return False

Answer 2: (score: 0)

You can build an inverted dict for fast lookups and keep it in a cache, e.g.:

>>> from collections import defaultdict
>>> 
>>> users_inverted = defaultdict(list)
>>> for pk, user in users_from_client.iteritems():
...  for key in user.iteritems():
...   users_inverted[key].append(int(pk))
... 
>>> users_inverted
defaultdict(<type 'list'>, {('COL3', 'whatever'): [1, 0, 3], ('COL2', 'val1'): [0], ('COL1', 'whatever'): [1, 0, 3], ('COL2', 'val2'): [1], ('COL2', 'val3'): [3]})

Then looking up a user is very fast:

>>> def _check(element, pattern = 'COL2'):
...  return bool(users_inverted[(pattern, element)])
>>> 
>>> _check('whatever', 'COL3')
True
>>> _check('whatever', 'COL333')
False

Besides speed, you also get the list of users for each attribute pair.
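
To illustrate that last point, here is a small sketch of reading the user ids back out of the inverted index. It is written for Python 3 (`.items()` instead of `.iteritems()`), and the helper name `users_with` is made up for this example. It uses `.get` rather than indexing because indexing a `defaultdict` with a missing key inserts an empty list as a side effect.

```python
from collections import defaultdict

users_from_client = {
    "0": {"COL1": "whatever", "COL2": "val1", "COL3": "whatever"},
    "1": {"COL1": "whatever", "COL2": "val2", "COL3": "whatever"},
    "3": {"COL1": "whatever", "COL2": "val3", "COL3": "whatever"},
}

# Map each (column, value) pair to the ids of the users that carry it.
users_inverted = defaultdict(list)
for pk, user in users_from_client.items():
    for pair in user.items():
        users_inverted[pair].append(int(pk))


def users_with(pattern, element):
    # Hypothetical helper: .get does not add missing pairs to the index.
    return users_inverted.get((pattern, element), [])


print(users_with("COL2", "val2"))              # [1]
print(sorted(users_with("COL3", "whatever")))  # [0, 1, 3]
print(users_with("COL2", "val9"))              # []
```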