I have a long list of parts that is bulleted in a Word document. I need to turn it into a data table.
Example Input List:
= Fasteners
o Screws
- Machine
+Round Head
+Pan Head
+Flat Head
- Tapping
+Type AB
+Type A
Example Output Table:
Parent   |Child |Type   |Style
Fasteners|Screws|Machine|Round Head
Fasteners|Screws|Machine|Pan Head
Fasteners|Screws|Machine|Flat Head
Fasteners|Screws|Tapping|Type AB
Fasteners|Screws|Tapping|Type A
etc.
Thanks for looking!
Answer 0 (score: 1)
Assuming you can convert the bullet points into a Python dictionary (which is probably the best way to store everything if the list is nested):
import pandas as pd

# Nested dictionary mirroring the bullet hierarchy:
# Parent -> Child -> Type -> list of Styles
parts = {
    'Fasteners': {
        'Screws': {
            'Machine': ['Round Head', 'Pan Head', 'Flat Head'],
            'Tapping': ['Type AB', 'Type A']
        }
    }
}

# One list per output column; the lists stay aligned row by row
df_dict = {'Parent': [], 'Child': [], 'Type': [], 'Style': []}

for parent, v1 in parts.items():
    for child, v2 in v1.items():
        for child_type, v3 in v2.items():
            for style in v3:
                df_dict['Parent'].append(parent)
                df_dict['Child'].append(child)
                df_dict['Type'].append(child_type)  # Not named type because type is a Python built-in
                df_dict['Style'].append(style)

df = pd.DataFrame(df_dict)
print(df)
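pandas works best for creating a DataFrame when you have a dictionary in which each key is a column and each value is a list of values (all in the same row order). What I did here is iterate over every key and value in the nested dictionary so that I can build those lists, repeating the parent values where necessary (in a way that is easy to follow). parts.items() returns a view of the dictionary's key/value pairs, so the loop visits each key together with its corresponding value. The output is as follows: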
      Parent   Child     Type       Style
0  Fasteners  Screws  Machine  Round Head
1  Fasteners  Screws  Machine    Pan Head
2  Fasteners  Screws  Machine   Flat Head
3  Fasteners  Screws  Tapping     Type AB
4  Fasteners  Screws  Tapping      Type A
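
The answer above assumes the bullets have already been turned into a dictionary. If the list is instead pasted out of Word as plain text, one possible shortcut is to read the bullet markers directly and build the rows without the intermediate dictionary. The following is only a minimal sketch under stated assumptions: that each nesting level has its own marker character, exactly as in the example list (`=`, `o`, `-`, `+`), and that every line starts with its marker. The names `LEVEL_MARKERS`, `COLUMNS`, and `bullets_to_rows` are hypothetical and made up for this illustration; a real Word export may use different markers or rely on indentation instead.

```python
# Assumption: one marker character per nesting level, taken from the
# example list in the question. Adjust to match the actual export.
LEVEL_MARKERS = ['=', 'o', '-', '+']
COLUMNS = ['Parent', 'Child', 'Type', 'Style']


def bullets_to_rows(text):
    """Return one row per deepest-level bullet, with its ancestors filled in."""
    current = [None] * len(LEVEL_MARKERS)  # latest item seen at each level
    rows = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        marker, label = line[0], line[1:].strip()
        if marker not in LEVEL_MARKERS:
            continue  # ignore lines that do not start with a known marker
        level = LEVEL_MARKERS.index(marker)
        current[level] = label
        # Clear deeper levels so values from a previous branch do not leak
        for deeper in range(level + 1, len(current)):
            current[deeper] = None
        if level == len(LEVEL_MARKERS) - 1:
            rows.append(list(current))
    return rows


example = """\
= Fasteners
o Screws
- Machine
+Round Head
+Pan Head
+Flat Head
- Tapping
+Type AB
+Type A
"""

rows = bullets_to_rows(example)

# Print in the pipe-delimited layout requested in the question
print('|'.join(COLUMNS))
for row in rows:
    print('|'.join(row))
```

If pandas is installed, `pd.DataFrame(rows, columns=COLUMNS)` should give the same DataFrame as the nested-loop version. One caveat of this sketch: because the marker is matched on the first character only, any stray line that happens to begin with `o`, `-`, `=`, or `+` would be treated as a bullet.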