Question

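I have two CSV files. The first is a data file and the other is a mapping file. The mapping file has 4 columns: Device_Name, GDN, Device_Type, and Device_OS. The same columns are present in the data file.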

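The data file has the Device_Name column populated and the other three columns blank. All four columns are populated in the mapping file. I want my Python code to open both files and, for each Device_Name in the data file, map its GDN, Device_Type, and Device_OS values from the mapping file.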

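I know how to use a dict when only 2 columns are present (1 needs to be mapped), but I don't know how to accomplish this when 3 columns need to be mapped.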

Following is the code with which I tried to accomplish the mapping of Device_Type:

x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
    file_map = csv.reader(in_file1, delimiter=',')
    for row in file_map:
       typemap = [row[0],row[2]]
       x.append(typemap)

with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
    writer = csv.writer(out_file, delimiter=',')
    for row in csv.reader(in_file2, delimiter=','):
         try:
              row[27] = x[row[11]]
         except KeyError:
              row[27] = ""
         writer.writerow(row)

It returns an AttributeError.

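After some research, I realized that I need to create a nested dict, but I don't have any idea how to do this. Please help me resolve this, or nudge me in the right direction.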

Answer 1

A nested dict is a dictionary within a dictionary. A very simple thing.

>>> d = {}
>>> d['dict1'] = {}
>>> d['dict1']['innerkey'] = 'value'
>>> d
{'dict1': {'innerkey': 'value'}}

You can also use defaultdict from the collections package to facilitate creating nested dictionaries.

>>> import collections
>>> d = collections.defaultdict(dict)
>>> d['dict1']['innerkey'] = 'value'
>>> d  # currently a defaultdict type
defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
>>> dict(d)  # but is exactly like a normal dictionary.
{'dict1': {'innerkey': 'value'}}

You can populate that however you want.

I would recommend something like the following in your code:

d = {}  # can use defaultdict(dict) instead

for row in file_map:
    # derive row key from something 
    # when using defaultdict, we can skip the next step creating a dictionary on row_key
    d[row_key] = {} 
    for idx, col in enumerate(row):
        d[row_key][idx] = col

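According to your comment:

Maybe the code above is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns i, j, k, l, and b.csv also has these columns. i is the key column for these CSVs. Columns j, k, and l are empty in a.csv but populated in b.csv. I want to map the values of the j, k, l columns from b.csv to a.csv, using i as the key column.

My suggestion would be something like this (without using defaultdict):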

a_file = "path/to/a.csv"
b_file = "path/to/b.csv"

# read from file a.csv
with open(a_file) as f:
    # skip headers
    next(f)
    # get first column as keys; build a set while the file is still open
    # (a lazy generator would be read after the file is closed, and would
    # also be exhausted by the first membership test)
    keys = {line.split(',')[0].strip() for line in f}

# create empty dictionary:
d = {}

# read from file b.csv
with open(b_file) as f:
    # gather headers except first key header
    headers = next(f).strip().split(',')[1:]
    # iterate lines
    for line in f:
        # gather the columns
        cols = line.strip().split(',')
        # check to make sure this key should be mapped.
        if cols[0] not in keys:
            continue
        # add key to dict
        d[cols[0]] = dict(
            # inner keys are the header names, values are columns
            (headers[idx], v) for idx, v in enumerate(cols[1:]))

Please note that for parsing CSV files there is a csv module.
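As a rough illustration of the csv module applied to the original question, here is a minimal sketch. It assumes both files have a header row with the four column names from the question; the sample data, the in-memory io.StringIO stand-ins for the files, and the helper names build_mapping and fill_rows are all hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for the two files on disk.
MAPPING_CSV = """Device_Name,GDN,Device_Type,Device_OS
iPhone 5,Apple,Phone,iOS
Nexus 7,Google,Tablet,Android
"""
DATA_CSV = """Device_Name,GDN,Device_Type,Device_OS
Nexus 7,,,
Unknown Device,,,
iPhone 5,,,
"""


def build_mapping(map_file, key="Device_Name"):
    """Return {device_name: {column: value, ...}} from a mapping CSV."""
    mapping = {}
    for row in csv.DictReader(map_file):
        # the inner dict holds every column except the key column
        mapping[row[key]] = {c: v for c, v in row.items() if c != key}
    return mapping


def fill_rows(data_file, mapping, key="Device_Name"):
    """Yield data rows with their blank columns filled from the mapping."""
    for row in csv.DictReader(data_file):
        # devices missing from the mapping keep their blank columns
        row.update(mapping.get(row[key], {}))
        yield row


mapping = build_mapping(io.StringIO(MAPPING_CSV))
filled = list(fill_rows(io.StringIO(DATA_CSV), mapping))
for row in filled:
    print(row)
```

With real files, the io.StringIO objects would be replaced by files opened with open(path, newline=""), and the filled rows could be written back out with csv.DictWriter.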

Answer 2

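UPDATE: For a nested dictionary of arbitrary depth, go to this answer.

Use defaultdict from collections.

High performance: checking "if key not in dict" is very expensive when the data set is large.

Low maintenance: it makes the code more readable and easy to extend.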

from collections import defaultdict

target_dict = defaultdict(dict)
target_dict[key1][key2] = val
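To make the snippet above concrete, here is a small sketch that fills a defaultdict(dict) from a list of records. The records list and its values are hypothetical and only serve as sample input.

```python
from collections import defaultdict

# Hypothetical (outer key, inner key, value) records.
records = [
    ("iPhone 5", "Device_Type", "Phone"),
    ("iPhone 5", "Device_OS", "iOS"),
    ("Nexus 7", "Device_Type", "Tablet"),
]

target_dict = defaultdict(dict)
for key1, key2, val in records:
    # No "if key1 not in target_dict" check is needed here: the inner
    # dict is created automatically the first time key1 is accessed.
    target_dict[key1][key2] = val

print(dict(target_dict))
```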

Answer 3

For arbitrary levels of nestedness:

In [2]: def nested_dict():
   ...:     return collections.defaultdict(nested_dict)
   ...:

In [3]: a = nested_dict()

In [4]: a
Out[4]: defaultdict(<function __main__.nested_dict>, {})

In [5]: a['a']['b']['c'] = 1

In [6]: a
Out[6]:
defaultdict(<function __main__.nested_dict>,
            {'a': defaultdict(<function __main__.nested_dict>,
                         {'b': defaultdict(<function __main__.nested_dict>,
                                      {'c': 1})})})
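If the nested defaultdict later needs to behave like an ordinary dict (for example, so that missing keys raise KeyError again), one possible approach is to convert it recursively. The helper name to_plain is hypothetical and not part of the collections module.

```python
import collections


def nested_dict():
    return collections.defaultdict(nested_dict)


def to_plain(d):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(d, dict):
        return {k: to_plain(v) for k, v in d.items()}
    return d


a = nested_dict()
a['a']['b']['c'] = 1
a['a']['x'] = 2

plain = to_plain(a)
print(plain)
```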

Answer 4

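It is important to remember that when using defaultdict and similar nested dict modules such as nested_dict, looking up a nonexistent key may inadvertently create a new key entry in the dict and cause a lot of havoc. Here is a Python 3 example with the nested_dict module.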

import nested_dict as nd
nest = nd.nested_dict()
nest['outer1']['inner1'] = 'v11'
nest['outer1']['inner2'] = 'v12'
print('original nested dict: \n', nest)
try:
    nest['outer1']['wrong_key1']
except KeyError as e:
    print('exception missing key', e)
print('nested dict after lookup with missing key.  no exception raised:\n', nest)

# instead convert back to normal dict
nest_d = nest.to_dict(nest)
try:
    print('converted to normal dict. Trying to lookup Wrong_key2')
    nest_d['outer1']['wrong_key2']
except KeyError as e:
    print('exception missing key', e)
else:
    print(' no exception raised:\n')
# or use dict.keys to check if key in nested dict.
print('checking with dict.keys')
print(list(nest['outer1'].keys()))
if 'wrong_key3' in list(nest.keys()):

    print('found wrong_key3')
else:
    print(' did not find wrong_key3')

The output is:

original nested dict:   {"outer1": {"inner2": "v12", "inner1": "v11"}}

nested dict after lookup with missing key.  no exception raised:  
{"outer1": {"wrong_key1": {}, "inner2": "v12", "inner1": "v11"}} 

converted to normal dict. 
Trying to lookup Wrong_key2 

exception missing key 'wrong_key2' 

checking with dict.keys 

['wrong_key1', 'inner2', 'inner1']  
did not find wrong_key3
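The same pitfall can be reproduced with the standard library alone. The sketch below uses collections.defaultdict rather than the third-party nested_dict package, and the key names are made up for illustration. It shows that a plain lookup creates the key, while a membership test or .get() does not.

```python
from collections import defaultdict

nest = defaultdict(dict)
nest['outer1']['inner1'] = 'v11'

# A plain subscript lookup silently creates the missing outer key.
_ = nest['typo']
created = 'typo' in nest

# Membership tests and .get() do not trigger the default factory.
safe_in = 'other' in nest
safe_get = nest.get('other')
still_absent = 'other' not in nest

print(sorted(nest))
```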

Answer 5

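If you want to create a nested dictionary given a list (of arbitrary length) for a path, and perform a function on an item that may exist at the end of the path, this handy little recursive function is quite helpful: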

Example:

def ensure_path(data, path, default=None, default_func=lambda x: x):
    """
    Function:

    - Ensures a path exists within a nested dictionary

    Requires:

    - `data`:
        - Type: dict
        - What: A dictionary to check if the path exists
    - `path`:
        - Type: list of strs
        - What: The path to check

    Optional:

    - `default`:
        - Type: any
        - What: The default item to add to a path that does not yet exist
        - Default: None

    - `default_func`:
        - Type: function
        - What: A single input function that takes in the current path item (or default) and adjusts it
        - Default: `lambda x: x` # Returns the value in the dict or the default value if none was present
    """
    if len(path)>1:
        if path[0] not in data:
            data[path[0]]={}
        data[path[0]]=ensure_path(data=data[path[0]], path=path[1:], default=default, default_func=default_func)
    else:
        if path[0] not in data:
            data[path[0]]=default
        data[path[0]]=default_func(data[path[0]])
    return data
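A possible usage sketch follows: counting how often each path is seen. The function body is a condensed copy of ensure_path from above, repeated only so the block runs on its own; the counts dictionary and the sample paths are hypothetical.

```python
def ensure_path(data, path, default=None, default_func=lambda x: x):
    # Condensed copy of the function above (docstring omitted).
    if len(path) > 1:
        if path[0] not in data:
            data[path[0]] = {}
        data[path[0]] = ensure_path(data[path[0]], path[1:], default, default_func)
    else:
        if path[0] not in data:
            data[path[0]] = default
        data[path[0]] = default_func(data[path[0]])
    return data


counts = {}
paths = (['fruit', 'apple'], ['fruit', 'apple'], ['veg', 'root', 'carrot'])
for path in paths:
    # start each new leaf at 0, then add 1 on every visit
    ensure_path(counts, path, default=0, default_func=lambda x: x + 1)

print(counts)
```

Here default supplies the starting value for a leaf that does not exist yet, and default_func is applied to the leaf on every call, including the first.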

Answer 6

This is a nested list; we will append its data to empty dictionaries.

ls = [['a','a1','a2','a3'],['b','b1','b2','b3'],['c','c1','c2','c3'], 
['d','d1','d2','d3']]

This creates four empty dictionaries inside data_dict:

data_dict = {f'dict{i}':{} for i in range(4)}
for i in range(4):
    upd_dict = {'val' : ls[i][0], 'val1' : ls[i][1],'val2' : ls[i][2],'val3' : ls[i][3]}

    data_dict[f'dict{i}'].update(upd_dict)

print(data_dict)

Output

{'dict0': {'val': 'a', 'val1': 'a1', 'val2': 'a2', 'val3': 'a3'}, 'dict1': {'val': 'b', 'val1': 'b1', 'val2': 'b2', 'val3': 'b3'}, 'dict2': {'val': 'c', 'val1': 'c1', 'val2': 'c2', 'val3': 'c3'}, 'dict3': {'val': 'd', 'val1': 'd1', 'val2': 'd2', 'val3': 'd3'}}
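The same structure could also be built in one step with zip and a dict comprehension, which avoids indexing each row by position. This is only an alternative sketch; the field_names list is a name introduced here for illustration.

```python
ls = [['a', 'a1', 'a2', 'a3'], ['b', 'b1', 'b2', 'b3'],
      ['c', 'c1', 'c2', 'c3'], ['d', 'd1', 'd2', 'd3']]

# hypothetical list of inner keys, one per column
field_names = ['val', 'val1', 'val2', 'val3']

# pair every row with the field names and key the result by its index
data_dict = {f'dict{i}': dict(zip(field_names, row)) for i, row in enumerate(ls)}

print(data_dict)
```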

Answer 7

#in jupyter
import sys
!conda install -c conda-forge --yes --prefix {sys.prefix} nested_dict 
import nested_dict as nd
d = nd.nested_dict()

'd' can now be used to store nested key-value pairs.

Answer 8

travel_log = {
    "France" : {"cities_visited" : ["paris", "lille", "dijon"], "total_visits" : 10},
    "india" : {"cities_visited" : ["Mumbai", "delhi", "surat",], "total_visits" : 12}
}
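A short sketch of how such a literal nested dict might be read and updated. The "Japan" entry and the variable names are hypothetical additions for illustration.

```python
travel_log = {
    "France": {"cities_visited": ["paris", "lille", "dijon"], "total_visits": 10},
    "india": {"cities_visited": ["Mumbai", "delhi", "surat"], "total_visits": 12},
}

# read a nested value by chaining the keys
french_cities = travel_log["France"]["cities_visited"]

# update a nested value in place
travel_log["india"]["total_visits"] += 1

# setdefault returns the existing inner dict, or inserts and returns a new one
entry = travel_log.setdefault("Japan", {"cities_visited": [], "total_visits": 0})
entry["cities_visited"].append("Kyoto")
entry["total_visits"] += 1

print(travel_log["Japan"])
```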

Answer 9

dmin()

Reference:

How do you create nested dicts in Python?
