Question

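Suppose I have quantities of fruits of different colors, e.g., 24 blue bananas, 12 green apples, 0 blue strawberries and so on. I'd like to organize them in a data structure in Python that allows for easy selection and sorting. My idea was to put them into a dictionary with tuples as keys, e.g.,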

{ ('banana',    'blue' ): 24,
  ('apple',     'green'): 12,
  ('strawberry','blue' ): 0,
  ...
}

or even dictionaries, e.g.,

{ {'fruit': 'banana',    'color': 'blue' }: 24,
  {'fruit': 'apple',     'color': 'green'}: 12,
  {'fruit': 'strawberry','color': 'blue' }: 0,
  ...
}

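I'd like to retrieve a list of all blue fruit, or bananas of all colors, for example, or to sort this dictionary by the name of the fruit. Are there ways to do this in a clean way?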

It might well be that dictionaries with tuples as keys are not the proper way to handle this situation.

All suggestions welcome!

Answer 1

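Personally, one of the things I love about Python is the tuple-dict combination. What you have here is effectively a 2D array (where x = fruit name and y = color), and I am generally a supporter of the dict of tuples for implementing 2D arrays, at least when something like numpy or a database isn't more appropriate. So in short, I think you've got a good approach.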

Note that you can't use dicts as keys in a dict without doing some extra work, so that's not a very good solution.
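
To illustrate the point about dict keys, here is a minimal Python 3 sketch. It shows the error a dict key produces and one possible form of the "extra work": freezing the dict's items into a frozenset. The frozenset workaround and the variable names are assumptions for illustration, not something the question or this answer prescribes.

```python
# A plain dict is unhashable, so using one as a dictionary key fails.
try:
    bad = {{'fruit': 'banana', 'color': 'blue'}: 24}
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# One possible workaround (an assumption): freeze the items into a frozenset,
# which is immutable and hashable as long as the values are hashable.
key = frozenset({'fruit': 'banana', 'color': 'blue'}.items())
counts = {key: 24}

# The order in which the fields are written does not matter for the lookup.
lookup = frozenset({'color': 'blue', 'fruit': 'banana'}.items())
print(counts[lookup])
```

The frozenset key works, but reading a field back out of it is clumsy, which is part of why a tuple or namedtuple is usually the more comfortable choice.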

That said, you should also consider namedtuple(). That way you could do this:

>>> from collections import namedtuple
>>> Fruit = namedtuple("Fruit", ["name", "color"])
>>> f = Fruit(name="banana", color="red")
>>> print f
Fruit(name='banana', color='red')
>>> f.name
'banana'
>>> f.color
'red'

Now you can use your fruitcount dict:

>>> fruitcount = {Fruit("banana", "red"):5}
>>> fruitcount[f]
5

Other tricks:

>>> fruits = fruitcount.keys()
>>> fruits.sort()
>>> print fruits
[Fruit(name='apple', color='green'), 
 Fruit(name='apple', color='red'), 
 Fruit(name='banana', color='blue'), 
 Fruit(name='strawberry', color='blue')]
>>> fruits.sort(key=lambda x:x.color)
>>> print fruits
[Fruit(name='banana', color='blue'), 
 Fruit(name='strawberry', color='blue'), 
 Fruit(name='apple', color='green'), 
 Fruit(name='apple', color='red')]

Echoing chmullig, to get a list of all colors of one fruit, you would have to filter the keys, i.e.

bananas = [fruit for fruit in fruits if fruit.name=='banana']
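
The session above uses Python 2 syntax. As a rough Python 3 sketch of the same idea (the sample counts are taken from the question plus one made-up yellow banana entry, which is an assumption), selection and sorting over namedtuple keys might look like this:

```python
from collections import namedtuple

Fruit = namedtuple("Fruit", ["name", "color"])

# Counts keyed by (name, color); the yellow banana row is illustrative only.
fruitcount = {
    Fruit("banana", "blue"): 24,
    Fruit("apple", "green"): 12,
    Fruit("strawberry", "blue"): 0,
    Fruit("banana", "yellow"): 13,
}

# All colors of one fruit: filter the keys on the name field.
bananas = sorted(f for f in fruitcount if f.name == "banana")

# All fruit of one color, mapped to their counts.
blue = {f.name: n for f, n in fruitcount.items() if f.color == "blue"}

# Keys sorted by color first, then by name.
by_color = sorted(fruitcount, key=lambda f: (f.color, f.name))

print(bananas)
print(blue)
print(by_color)
```

In Python 3, `dict.keys()` returns a view without a `sort()` method, so `sorted(...)` is used here instead of sorting in place.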

Answer 2

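Your best option will be to create a simple data structure to model what you have. Then you can store these objects in a simple list and sort/retrieve them any way you wish.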

For this case, I would use the following class:

class Fruit:
    def __init__(self, name, color, quantity): 
        self.name = name
        self.color = color
        self.quantity = quantity

    def __str__(self):
        return "Name: %s, Color: %s, Quantity: %s" % \
     (self.name, self.color, self.quantity)

Then you can simply construct "Fruit" instances and add them to a list, as follows:

fruit1 = Fruit("apple", "red", 12)
fruit2 = Fruit("pear", "green", 22)
fruit3 = Fruit("banana", "yellow", 32)
fruits = [fruit3, fruit2, fruit1]

The simple list fruits will be much easier, less confusing, and better maintained.

Some examples of use:

All output below is the result of running the given code snippet followed by:

for fruit in fruits:
    print fruit

Unsorted list:

Displays:

Name: banana, Color: yellow, Quantity: 32
Name: pear, Color: green, Quantity: 22
Name: apple, Color: red, Quantity: 12

Sorted alphabetically by name:

fruits.sort(key=lambda x: x.name.lower())

Displays:

Name: apple, Color: red, Quantity: 12
Name: banana, Color: yellow, Quantity: 32
Name: pear, Color: green, Quantity: 22

Sorted by quantity:

fruits.sort(key=lambda x: x.quantity)

Displays:

Name: apple, Color: red, Quantity: 12
Name: pear, Color: green, Quantity: 22
Name: banana, Color: yellow, Quantity: 32

Where color == red:

red_fruit = filter(lambda f: f.color == "red", fruits)

Displays:

Name: apple, Color: red, Quantity: 12
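
The snippets in this answer use Python 2 syntax (`print fruit`, and `filter` returning a list). A hedged Python 3 sketch of the same class-and-list approach follows; the use of `operator.attrgetter` for a two-level sort is an addition for illustration, not part of the original answer.

```python
from operator import attrgetter


class Fruit:
    def __init__(self, name, color, quantity):
        self.name = name
        self.color = color
        self.quantity = quantity

    def __str__(self):
        return "Name: %s, Color: %s, Quantity: %s" % (
            self.name, self.color, self.quantity)


fruits = [
    Fruit("banana", "yellow", 32),
    Fruit("pear", "green", 22),
    Fruit("apple", "red", 12),
]

# In Python 3, filter() returns a lazy iterator, so a list comprehension
# is a simple way to get a reusable list of matches.
red_fruit = [f for f in fruits if f.color == "red"]

# Sort by color first and by name second, without modifying the original list.
by_color_then_name = sorted(fruits, key=attrgetter("color", "name"))

for fruit in red_fruit:
    print(fruit)
for fruit in by_color_then_name:
    print(fruit)
```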

Answer 3

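Database, dict of dicts, dictionary of list of dictionaries, named tuple (it's a subclass), sqlite, redundancy... I didn't believe my eyes. What else?

"It might well be that dictionaries with tuples as keys are not the proper way to handle this situation."

"my gut feeling is that a database is overkill for the OP's needs;"

Yeah! I thought so too.

So, in my opinion, a list of tuples is plenty enough: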

from operator import itemgetter

li = [  ('banana',     'blue'   , 24) ,
        ('apple',      'green'  , 12) ,
        ('strawberry', 'blue'   , 16 ) ,
        ('banana',     'yellow' , 13) ,
        ('apple',      'gold'   , 3 ) ,
        ('pear',       'yellow' , 10) ,
        ('strawberry', 'orange' , 27) ,
        ('apple',      'blue'   , 21) ,
        ('apple',      'silver' , 0 ) ,
        ('strawberry', 'green'  , 4 ) ,
        ('banana',     'brown'  , 14) ,
        ('strawberry', 'yellow' , 31) ,
        ('apple',      'pink'   , 9 ) ,
        ('strawberry', 'gold'   , 0 ) ,
        ('pear',       'gold'   , 66) ,
        ('apple',      'yellow' , 9 ) ,
        ('pear',       'brown'  , 5 ) ,
        ('strawberry', 'pink'   , 8 ) ,
        ('apple',      'purple' , 7 ) ,
        ('pear',       'blue'   , 51) ,
        ('chesnut',    'yellow',  0 )   ]


print set( u[1] for u in li ),': all potential colors'
print set( c for f,c,n in li if n!=0),': all effective colors'
print [ c for f,c,n in li if f=='banana' ],': all potential colors of bananas'
print [ c for f,c,n in li if f=='banana' and n!=0],': all effective colors of bananas'
print

print set( u[0] for u in li ),': all potential fruits'
print set( f for f,c,n in li if n!=0),': all effective fruits'
print [ f for f,c,n in li if c=='yellow' ],': all potential fruits being yellow'
print [ f for f,c,n in li if c=='yellow' and n!=0],': all effective fruits being yellow'
print

print len(set( u[1] for u in li )),': number of all potential colors'
print len(set(c for f,c,n in li if n!=0)),': number of all effective colors'
print len( [c for f,c,n in li if f=='strawberry']),': number of potential colors of strawberry'
print len( [c for f,c,n in li if f=='strawberry' and n!=0]),': number of effective colors of strawberry'
print

# sorting li by name of fruit
print sorted(li),'  sorted li by name of fruit'
print

# sorting li by number 
print sorted(li, key = itemgetter(2)),'  sorted li by number'
print

# sorting li first by name of color and secondly by name of fruit
print sorted(li, key = itemgetter(1,0)),'  sorted li first by name of color and secondly by name of fruit'
print

Result

set(['blue', 'brown', 'gold', 'purple', 'yellow', 'pink', 'green', 'orange', 'silver']) : all potential colors
set(['blue', 'brown', 'gold', 'purple', 'yellow', 'pink', 'green', 'orange']) : all effective colors
['blue', 'yellow', 'brown'] : all potential colors of bananas
['blue', 'yellow', 'brown'] : all effective colors of bananas

set(['strawberry', 'chesnut', 'pear', 'banana', 'apple']) : all potential fruits
set(['strawberry', 'pear', 'banana', 'apple']) : all effective fruits
['banana', 'pear', 'strawberry', 'apple', 'chesnut'] : all potential fruits being yellow
['banana', 'pear', 'strawberry', 'apple'] : all effective fruits being yellow

9 : number of all potential colors
8 : number of all effective colors
6 : number of potential colors of strawberry
5 : number of effective colors of strawberry

[('apple', 'blue', 21), ('apple', 'gold', 3), ('apple', 'green', 12), ('apple', 'pink', 9), ('apple', 'purple', 7), ('apple', 'silver', 0), ('apple', 'yellow', 9), ('banana', 'blue', 24), ('banana', 'brown', 14), ('banana', 'yellow', 13), ('chesnut', 'yellow', 0), ('pear', 'blue', 51), ('pear', 'brown', 5), ('pear', 'gold', 66), ('pear', 'yellow', 10), ('strawberry', 'blue', 16), ('strawberry', 'gold', 0), ('strawberry', 'green', 4), ('strawberry', 'orange', 27), ('strawberry', 'pink', 8), ('strawberry', 'yellow', 31)]   sorted li by name of fruit

[('apple', 'silver', 0), ('strawberry', 'gold', 0), ('chesnut', 'yellow', 0), ('apple', 'gold', 3), ('strawberry', 'green', 4), ('pear', 'brown', 5), ('apple', 'purple', 7), ('strawberry', 'pink', 8), ('apple', 'pink', 9), ('apple', 'yellow', 9), ('pear', 'yellow', 10), ('apple', 'green', 12), ('banana', 'yellow', 13), ('banana', 'brown', 14), ('strawberry', 'blue', 16), ('apple', 'blue', 21), ('banana', 'blue', 24), ('strawberry', 'orange', 27), ('strawberry', 'yellow', 31), ('pear', 'blue', 51), ('pear', 'gold', 66)]   sorted li by number

[('apple', 'blue', 21), ('banana', 'blue', 24), ('pear', 'blue', 51), ('strawberry', 'blue', 16), ('banana', 'brown', 14), ('pear', 'brown', 5), ('apple', 'gold', 3), ('pear', 'gold', 66), ('strawberry', 'gold', 0), ('apple', 'green', 12), ('strawberry', 'green', 4), ('strawberry', 'orange', 27), ('apple', 'pink', 9), ('strawberry', 'pink', 8), ('apple', 'purple', 7), ('apple', 'silver', 0), ('apple', 'yellow', 9), ('banana', 'yellow', 13), ('chesnut', 'yellow', 0), ('pear', 'yellow', 10), ('strawberry', 'yellow', 31)]   sorted li first by name of color and secondly by name of fruit

Answer 4

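A dictionary probably isn't what you should be using in this case. A more full-featured library would be a better alternative. Probably a real database. The easiest would be sqlite. You can keep the whole thing in memory by passing in the string ':memory:' instead of a filename.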
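
As a minimal sketch of the in-memory sqlite suggestion, using Python's standard `sqlite3` module (the table name `fruit` and its column names are assumptions chosen for illustration):

```python
import sqlite3

# ':memory:' keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fruit ("
    " name TEXT, color TEXT, quantity INTEGER,"
    " PRIMARY KEY (name, color))"
)

rows = [
    ("banana", "blue", 24),
    ("apple", "green", 12),
    ("strawberry", "blue", 0),
]
conn.executemany("INSERT INTO fruit VALUES (?, ?, ?)", rows)

# All blue fruit, sorted by name.
blue = conn.execute(
    "SELECT name, quantity FROM fruit WHERE color = ? ORDER BY name",
    ("blue",),
).fetchall()

# Everything, sorted by fruit name.
by_name = conn.execute(
    "SELECT name, color, quantity FROM fruit ORDER BY name"
).fetchall()

print(blue)
print(by_name)
conn.close()
```

Selection and sorting then become `WHERE` and `ORDER BY` clauses, at the cost of moving the data out of plain Python structures.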

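If you do want to continue down this path, you can do it with the extra attributes in the key or the value. However, a dictionary can't be the key to another dictionary, but a tuple can. The docs explain what's allowable. The key must be an immutable object, which includes strings, numbers, and tuples that contain only strings and numbers (and further tuples that recursively contain only those types...).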

You could do your first example with d = {('apple', 'red') : 4}, but it'll be very hard to query for what you want. You'd need to do something like this:

#find all apples
apples = [d[key] for key in d.keys() if key[0] == 'apple']

#find all red items
red = [d[key] for key in d.keys() if key[1] == 'red']

#the red apple
redapples = d[('apple', 'red')]

Answer 5

With keys as tuples, you just filter the keys with the given second component and sort them:

blue_fruit = sorted([k for k in data.keys() if k[1] == 'blue'])
for k in blue_fruit:
  print k[0], data[k] # prints 'banana 24', etc

Sorting works because tuples have a natural ordering if their components have a natural ordering.
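
A small Python 3 sketch of that ordering rule follows. Tuples compare element by element, so the first component decides and the second only breaks ties. The `data` dict here is illustrative; the `('apple', 'blue')` entry is an assumption added to make the ordering visible.

```python
data = {
    ('banana', 'blue'): 24,
    ('apple', 'green'): 12,
    ('strawberry', 'blue'): 0,
    ('apple', 'blue'): 21,
}

# Element-wise comparison: names are compared first, colors only on a tie.
assert ('apple', 'blue') < ('apple', 'green') < ('banana', 'blue')

# Filter on the second component, then rely on the natural tuple order.
blue_fruit = sorted(k for k in data if k[1] == 'blue')
for name, color in blue_fruit:
    print(name, data[(name, color)])
```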

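With keys as rather full-fledged objects, you just filter by k.color == 'blue'.

You can't really use dicts as keys, but you can create the simplest possible class, like class Foo(object): pass, and add any attributes to it on the fly: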

k = Foo()
k.color = 'blue'

These instances can serve as dict keys, but beware of their mutability!
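
One way to read the mutability warning, sketched in Python 3 under the assumption that `Foo` defines no `__eq__` or `__hash__` of its own (the `make_key` helper is hypothetical): such instances hash by identity, so an equal-looking twin is a different key, and changing an attribute silently changes what later filters see.

```python
class Foo(object):
    pass


def make_key(name, color):
    # Hypothetical helper: build a Foo and attach attributes on the fly.
    k = Foo()
    k.name = name
    k.color = color
    return k


banana = make_key('banana', 'blue')
data = {banana: 24, make_key('apple', 'green'): 12}

# Filtering by attribute works as described.
blue = [k.name for k in data if k.color == 'blue']

# Default hashing is identity-based: a second object with the same
# attributes is NOT the same key.
twin = make_key('banana', 'blue')
twin_found = twin in data

# Mutating a key does not break the dict, but it changes later query results.
banana.color = 'yellow'
blue_after = [k.name for k in data if k.color == 'blue']

print(blue, twin_found, blue_after)
```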

Answer 6

You could have a dictionary where the entries are a list of other dictionaries:

fruit_dict = dict()
fruit_dict['banana'] = [{'yellow': 24}]
fruit_dict['apple'] = [{'red': 12}, {'green': 14}]
print fruit_dict

Output:

{'banana': [{'yellow': 24}], 'apple': [{'red': 12}, {'green': 14}]}

Edit: As eumiro pointed out, you could use a dictionary of dictionaries:

fruit_dict = dict()
fruit_dict['banana'] = {'yellow': 24}
fruit_dict['apple'] = {'red': 12, 'green': 14}
print fruit_dict

Output:

{'banana': {'yellow': 24}, 'apple': {'green': 14, 'red': 12}}
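
For the queries asked about in the question, a dictionary of dictionaries could be used roughly as follows. This is a Python 3 sketch; `collections.defaultdict` and the blue banana entry are assumptions added for illustration.

```python
from collections import defaultdict

# defaultdict(dict) creates the inner dict the first time a fruit is seen.
fruit_dict = defaultdict(dict)
fruit_dict['banana']['yellow'] = 24
fruit_dict['banana']['blue'] = 4      # illustrative entry, not from the answer
fruit_dict['apple']['red'] = 12
fruit_dict['apple']['green'] = 14

# All colors of one fruit: a direct lookup on the outer key.
banana_colors = sorted(fruit_dict['banana'])

# All fruit of one color: this needs a scan over every inner dict.
blue_fruit = {fruit: colors['blue']
              for fruit, colors in fruit_dict.items()
              if 'blue' in colors}

# Fruit names in alphabetical order.
fruit_names = sorted(fruit_dict)

print(banana_colors, blue_fruit, fruit_names)
```

The asymmetry is the design trade-off: lookups by fruit are direct, while lookups by color have to walk the whole structure.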

Answer 7

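You want to use two keys independently, so you have two choices:

Store the data redundantly in two dicts, as {'banana' : {'blue' : 4, ...}, .... } and {'blue': {'banana':4, ...} ...}. Then searching and sorting is easy, but you have to make sure you modify the dicts together.

Store it in just one dict, and then write functions that iterate over it, e.g.: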

d = {'banana' : {'blue' : 4, 'yellow':6}, 'apple':{'red':1} }

# the filter clause must come after the for clause; `in` replaces has_key
blueFruit = [(fruit, d[fruit]['blue']) for fruit in d.keys() if 'blue' in d[fruit]]
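
The first choice, redundant storage, could be sketched in Python 3 as below. The `set_count` helper is hypothetical; its only job is to make sure both dicts are always modified together.

```python
from collections import defaultdict

by_fruit = defaultdict(dict)   # fruit -> {color: count}
by_color = defaultdict(dict)   # color -> {fruit: count}


def set_count(fruit, color, count):
    # Hypothetical helper: write through to both indexes in one place
    # so they cannot drift apart.
    by_fruit[fruit][color] = count
    by_color[color][fruit] = count


set_count('banana', 'blue', 4)
set_count('banana', 'yellow', 6)
set_count('apple', 'red', 1)

# Both kinds of query are now direct lookups.
blue_fruit = by_color['blue']
banana_colors = sorted(by_fruit['banana'])

print(blue_fruit)
print(banana_colors)
```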

Answer 8

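This type of data is efficiently pulled from a Trie-like data structure. It also allows for fast sorting. The memory efficiency might not be that great though.

A traditional trie stores each letter of a word as a node in the tree. But in your case your "alphabet" is different. You are storing strings instead of characters.

It might look something like this: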

root:                Root
                     /|\
                    / | \
                   /  |  \     
fruit:       Banana Apple Strawberry
              / |      |     \
             /  |      |      \
color:     Blue Yellow Green  Blue
            /   |       |       \
           /    |       |        \
end:      24   100      12        0
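
A structure of this shape could be approximated with nested dictionaries. The Python 3 sketch below is an assumption about how one might do it, not a full trie implementation; the `insert` and `walk` helpers are hypothetical, and the counts are the ones from the diagram.

```python
trie = {}


def insert(path, value):
    # Walk down one level per path component, creating nodes as needed,
    # and store the count at the leaf.
    node = trie
    for part in path[:-1]:
        node = node.setdefault(part, {})
    node[path[-1]] = value


def walk(node, prefix=()):
    # Yield (path, value) pairs in sorted order at every level.
    for key in sorted(node):
        child = node[key]
        if isinstance(child, dict):
            yield from walk(child, prefix + (key,))
        else:
            yield prefix + (key,), child


insert(('banana', 'blue'), 24)
insert(('banana', 'yellow'), 100)
insert(('apple', 'green'), 12)
insert(('strawberry', 'blue'), 0)

entries = list(walk(trie))
banana_colors = sorted(trie['banana'])
print(entries)
print(banana_colors)
```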

See this link: trie in python

Python: Tuples/dictionaries as keys, select, sort

8 answers: