How to convert a nested list into a dataframe?

Time: 2019-01-09 22:14:00

Tags: python r excel

I have a long list of parts, formatted as nested bullets in a Word document. I need to turn it into a data table.

Example Input List:

= Fasteners

    o Screws

        - Machine

            +Round Head

            +Pan Head

            +Flat Head

        - Tapping

            +Type AB

            +Type A

Example Output Table (thanks for looking!):

Parent    | Child  | Type    | Style
Fasteners | Screws | Machine | Round Head
Fasteners | Screws | Machine | Pan Head
Fasteners | Screws | Machine | Flat Head
Fasteners | Screws | Tapping | Type AB
Fasteners | Screws | Tapping | Type A

etc etc

1 answer:

Answer 0 (score: 1)

Assuming you can convert the bullet points into a Python dictionary (which is probably the best way to store everything, since it is nested):

import pandas as pd

parts = {
    'Fasteners': {
        'Screws': {
            'Machine': ['Round Head', 'Pan Head', 'Flat Head'],
            'Tapping': ['Type AB', 'Type A'],
        }
    }
}

df_dict = {'Parent': [], 'Child': [], 'Type': [], 'Style': []}
for parent, v1 in parts.items():
    for child, v2 in v1.items():
        for child_type, v3 in v2.items():
            for style in v3:
                df_dict['Parent'].append(parent)
                df_dict['Child'].append(child)
                df_dict['Type'].append(child_type)  # called child_type to avoid shadowing Python's built-in type()
                df_dict['Style'].append(style)

df = pd.DataFrame(df_dict)
print(df)

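pandas works best for building a DataFrame when you give it a dictionary in which each key is a column name and each value is a list of that column's values, with all the lists in the same row order. What I did here is loop through every key and value in the nested dictionary to build those lists, repeating the parent values wherever necessary (in a way that is easy to follow). parts.items() returns a view of the dictionary's key/value pairs, so each loop walks through every key together with its corresponding value. The output is as follows: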

      Parent   Child     Type       Style
0  Fasteners  Screws  Machine  Round Head
1  Fasteners  Screws  Machine    Pan Head
2  Fasteners  Screws  Machine   Flat Head
3  Fasteners  Screws  Tapping     Type AB
4  Fasteners  Screws  Tapping      Type A
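
The answer above assumes the bullets have already been typed into a dictionary. If the list has instead been pasted out of the document as plain text, something like the following might save the retyping. It is only a minimal sketch under some assumptions: every bullet sits on its own line, starts with a one-character marker, and is indented with spaces (not tabs) deeper than its parent. The function name bullets_to_rows and the EXAMPLE text are made up for illustration.

```python
def bullets_to_rows(text):
    """Return one tuple per leaf bullet, holding the full path down to it."""
    stack = []   # (indent, name) pairs for the current chain of ancestors
    paths = []   # full path of every bullet, in document order
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines between bullets
        indent = len(line) - len(line.lstrip())
        # Assumption: the first non-space character is the bullet marker.
        name = stripped[1:].strip()
        # Drop siblings and deeper items left over from earlier lines.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, name))
        paths.append(tuple(n for _, n in stack))

    # A bullet is a leaf when the next bullet is not nested deeper than it.
    rows = []
    for i, path in enumerate(paths):
        is_last = i == len(paths) - 1
        if is_last or len(paths[i + 1]) <= len(path):
            rows.append(path)
    return rows


# Hypothetical sample; the Tapping line is indented unevenly on purpose.
EXAMPLE = """\
= Fasteners
    o Screws
        - Machine
            +Round Head
            +Pan Head
            +Flat Head
       - Tapping
            +Type AB
            +Type A
"""

rows = bullets_to_rows(EXAMPLE)
for row in rows:
    print(row)
```

Because each bullet is compared only with the indentation of its ancestors, slightly uneven indentation is tolerated as long as a child is indented further than its parent. When every leaf sits at the same depth, as in this sample, the rows can be passed straight to pandas with pd.DataFrame(rows, columns=['Parent', 'Child', 'Type', 'Style']).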