Question

I have a CSV file like this:

Category    Subcategory
-----------------------
cat         panther
cat         tiger
dog         wolf
dog         heyena
cat         lion
dog         beagle

I'm trying to write a script that outputs something like this (the order doesn't matter):

animals = [
              [['cat'], ['panther', 'tiger', 'lion']],
              [['dog'], ['wolf', 'heyena', 'beagle']]
          ]

So far, I've been able to build a list of unique categories and a list of unique subcategories:

catlist = []
subcatlist = []

for p in infile:
    if p[0] not in catlist:
        catlist.append(p[0])
    if p[1] not in subcatlist:
        subcatlist.append(p[1])

But I'm having trouble writing the logic that says: "if category 'cat' is in animals[], but 'panther' isn't in cat's list, add it."

I've played with zip() and dict() a bit, but I'm pretty much just flailing around here. I'm fairly new to Python. Using Python 3.

Answer 1

If you want to map keys to values, a dictionary is much easier to work with. A defaultdict is especially convenient for building one.

Assuming your infile yields each input line already split on whitespace, the following should help:

from collections import defaultdict

animals = defaultdict(list)

for p in infile:
    animals[p[0]].append(p[1])
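
As a minimal sketch of the full round trip, the snippet below splits the lines on whitespace, skips the header, and then turns the dictionary into the nested-list shape asked for in the question. It assumes the file really is whitespace-delimited with the two header lines shown in the question; the function name group_rows and the in-memory sample lines are made up for illustration and stand in for reading the real file.

```python
from collections import defaultdict


def group_rows(lines):
    """Group subcategories by category from whitespace-delimited lines."""
    animals = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines, the dashed separator, and the header row.
        if len(fields) != 2 or fields[0] == "Category":
            continue
        category, subcategory = fields
        animals[category].append(subcategory)
    return animals


# Hypothetical sample standing in for the lines of the real file.
lines = [
    "Category    Subcategory",
    "-----------------------",
    "cat         panther",
    "cat         tiger",
    "dog         wolf",
    "dog         heyena",
    "cat         lion",
    "dog         beagle",
]

animals = group_rows(lines)

# Convert the mapping into the nested-list layout from the question.
nested = [[[category], subcategories] for category, subcategories in animals.items()]
print(nested)
```

With a real file, the same function could be called as group_rows(open(path)), where path is whatever your file is named. If the file is actually comma-separated, csv.reader from the standard library would be a better fit than str.split.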

Answer 2

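You might consider using sets together with a dictionary. Use the category name as the dictionary key. Then, for each p in infile, call animals[p[0]].add(p[1]), assuming p[0] and p[1] are the category and the subcategory.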

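The advantage is that if 'Panther' shows up under 'Cat' more than once, you don't have to check whether it is already in the 'Cat' collection, because the set type guarantees that its elements are unique.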

>>> from collections import defaultdict
>>> animals = defaultdict(set)
>>> animals['Cat'].add('Panther')
>>> animals
defaultdict(<class 'set'>, {'Cat': {'Panther'}})
>>> animals['Cat'].add('Lion')
>>> animals
defaultdict(<class 'set'>, {'Cat': {'Lion', 'Panther'}})
>>> animals['Cat'].add('Panther')
>>> animals
defaultdict(<class 'set'>, {'Cat': {'Lion', 'Panther'}})

Compare this with using a list:

>>> moreanimals = defaultdict(list)
>>> moreanimals['Cat'].append('Panther')
>>> moreanimals
defaultdict(<class 'list'>, {'Cat': ['Panther']})
>>> moreanimals['Cat'].append('Panther')
>>> moreanimals
defaultdict(<class 'list'>, {'Cat': ['Panther', 'Panther']})
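
Applied to rows like the ones in the question, the set-based approach might look like the sketch below. The rows list is invented sample data (with a deliberately repeated pair) and stands in for the parsed file. Because sets are unordered, the sketch sorts each set at the end to get reproducible output.

```python
from collections import defaultdict

# Hypothetical parsed rows, including a repeated ('cat', 'panther') pair.
rows = [
    ("cat", "panther"),
    ("cat", "tiger"),
    ("dog", "wolf"),
    ("cat", "panther"),
    ("dog", "beagle"),
]

animals = defaultdict(set)
for category, subcategory in rows:
    # Adding an element that is already present leaves the set unchanged.
    animals[category].add(subcategory)

# Sets have no defined order, so sort each one for stable output.
result = {category: sorted(subs) for category, subs in animals.items()}
print(result)
```

If the original order of the subcategories matters, a list with an explicit "not in" check is the simpler choice, at the cost of a slower membership test on long lists.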

Python 3 CSV data structure question
