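I am reading table data from a PDF file, converting it to a data frame, and then converting that to a dictionary. My problem is that the keys of the resulting dictionary are not fixed: sometimes they look like {'Sno':1,'ItemDescription':'ABC'} and sometimes like {'Sl No':1, 'Description':'XYZ'}. I want to create a new dictionary with fixed keys, as shown below. The name on the left is the fixed key, and the list on the right holds the keys that may be extracted from the data frame. If an extracted key matches an entry in a list, its value should be mapped to the corresponding fixed key.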
Srno = ["Sno", "Sl No.", "Order No.", "PO No."]
Productdescription = ["Item Code / Product Description", "Description", "Description of Goods", "Particulars"]
HSNCode = ["HSN / SAC\nCode", "HSN Code", "HSN", "HSN/SAC"]
Quantity = ["Quantity"]
ASIN = ["ASIN"]
ISBN = ["ISBN/EAN/UPC"]
Rate = ["Unit Price\n[INR]", "Rate", "Unit cost", "List price"]
Tax = ["IGST[INR]\nAmount", "Tax rate", "Tax type", "Tax amount"]
Discount = ["Discount", "Disc. %"]
Total = ["Total amount", "Amount", "Total", "Total\n[INR]", "Line Total\n[INR]"]
Model = ["Model #"]
Here is a sample dictionary created from the data frame.
{'item': [{'Sno': 1,
'ItemCodeProductDescription': 'TGMOCL0015CORSAIRMOUSE,M55RGBPRO,PART#CH-9308011-AP',
'HSNSACCode': '8471.60.60',
'Quantity': 7,
'UnitPrice': 1741,
'Total': 12187,
'Rate': 18,
'LineTotal': 14380.66},
{'Sno': 2,
'ItemCodeProductDescription': 'TGMOCL0013CORSAIRMOUSE,HARPOONPRO-BLK-RGB,PART#CH-9301111-AP',
'HSNSACCode': '8471.60.60',
'Quantity': 8,
'UnitPrice': 1200,
'Total': 9600,
'Rate': 18,
'LineTotal': 11328.0},
{'Sno': 3,
'ItemCodeProductDescription': 'TGCBCL0029CORSAIRCABINETSPEC-05,BLK-PART#CC-9011138-WW',
'HSNSACCode': '8473.30.99',
'Quantity': 37,
'UnitPrice': 2225,
'Total': 82325,
'Rate': 18,
'LineTotal': 97143.5},
{'Sno': 4,
'ItemCodeProductDescription': 'TGHSCL0003CORSAIRGAMINGHEADSETHS50StereoCarbonPART#CA-9011170-AP',
'HSNSACCode': '8518.30.00',
'Quantity': 92,
'UnitPrice': 3000,
'Total': 276000,
'Rate': 18,
'LineTotal': 325680.0},
{'Sno': 5,
'ItemCodeProductDescription': 'TGMOCL0001CORSAIRMOUSE,HARPOON-BLK-RGB,PART#CH-9301011-AP',
'HSNSACCode': '8471.60.60',
'Quantity': 43,
'UnitPrice': 1018,
'Total': 43774,
'Rate': 18,
'LineTotal': 51653.32},
{'Sno': 6,
'ItemCodeProductDescription': 'TGKBCL0001CORSAIRKEYBOARDK95PLTN-BLK-MXSpeed-RGBPART#CH-9127014-NA',
'HSNSACCode': '8471.60.40',
'Quantity': 8,
'UnitPrice': 10750,
'Total': 86000,
'Rate': 18,
'LineTotal': 101480.0},
{'Sno': 7,
'ItemCodeProductDescription': 'TGKBCL0007CORSAIRKEYBOARDK55-BLK-RBRDME-RGBPART#CH-9206015-NA',
'HSNSACCode': '8471.60.40',
'Quantity': 14,
'UnitPrice': 2400,
'Total': 33600,
'Rate': 18,
'LineTotal': 39648.0}]}
The final dictionary should look like this:
{'item': [{'Srno': 1,
'ProductDescription': 'TGMOCL0015CORSAIRMOUSE,M55RGBPRO,PART#CH-9308011-AP',
'HSNCode': '8471.60.60',
'Quantity': 7,
'ASIN': None,
'ISBN': None,
'Rate': 1741,
'Discount': None,
'Model': None,
'Tax': 18,
'Total': 14380.66}
Please suggest an efficient way to create the new dictionary from the old one.
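Answer 0 (score: -1)
Because Python does not allow adding or removing keys in a dictionary while iterating over it, the renaming has to be done by building a new dictionary. The correct fixed key is found by checking which list of possible names the original key appears in.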
# `items` is the dictionary shown in the question; Srno, Productdescription, etc.
# are the alias lists defined there.
# Note: `key in <list>` is an exact match, so a key such as 'HSNSACCode'
# does not match the alias "HSN / SAC\nCode".
result_list = []
for i in items['item']:
    result = {}  # new dictionary for this row
    for key, value in i.items():
        if key in Srno:
            result['Srno'] = value
        elif key in Productdescription:
            result['ProductDescription'] = value
        elif key in HSNCode:
            result['HSNCode'] = value
        elif key in Quantity:
            result['Quantity'] = value
        elif key in ASIN:
            result['ASIN'] = value
        elif key in ISBN:
            result['ISBN'] = value
        elif key in Rate:
            result['Rate'] = value
        elif key in Tax:
            result['Tax'] = value
        elif key in Discount:
            result['Discount'] = value
        elif key in Total:
            result['Total'] = value
        elif key in Model:
            result['Model'] = value
    # Skip rows in which no key was recognised
    if result:
        result_list.append(result)
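The same idea can be written more compactly with a reverse lookup table instead of a long if/elif chain. The following is only a sketch under some assumptions: the alias lists are copied from the question, while `ALIASES`, `normalize`, `LOOKUP` and `standardize` are hypothetical names introduced here. It also assumes that comparing keys after dropping bracketed units such as "[INR]", punctuation, whitespace and case is acceptable, so that an extracted key like 'HSNSACCode' matches the alias "HSN / SAC\nCode". Fixed keys that are not found are filled with None, as in the desired output.

```python
import re

# Fixed key -> possible source names (lists copied from the question).
ALIASES = {
    "Srno": ["Sno", "Sl No.", "Order No.", "PO No."],
    "ProductDescription": ["Item Code / Product Description", "Description",
                           "Description of Goods", "Particulars"],
    "HSNCode": ["HSN / SAC\nCode", "HSN Code", "HSN", "HSN/SAC"],
    "Quantity": ["Quantity"],
    "ASIN": ["ASIN"],
    "ISBN": ["ISBN/EAN/UPC"],
    "Rate": ["Unit Price\n[INR]", "Rate", "Unit cost", "List price"],
    "Tax": ["IGST[INR]\nAmount", "Tax rate", "Tax type", "Tax amount"],
    "Discount": ["Discount", "Disc. %"],
    "Total": ["Total amount", "Amount", "Total", "Total\n[INR]",
              "Line Total\n[INR]"],
    "Model": ["Model #"],
}


def normalize(label):
    """Drop bracketed text such as "[INR]", then keep only lowercase letters and digits."""
    label = re.sub(r"\[[^\]]*\]", "", str(label))
    return re.sub(r"[^0-9a-z]", "", label.lower())


# Reverse lookup: normalized alias -> fixed key (assumed helper, built once).
LOOKUP = {
    normalize(alias): target
    for target, aliases in ALIASES.items()
    for alias in aliases
}


def standardize(row):
    """Return a new dict with every fixed key; unmatched fixed keys stay None."""
    result = dict.fromkeys(ALIASES)
    for key, value in row.items():
        target = LOOKUP.get(normalize(key))
        if target is not None:
            result[target] = value  # a later matching key overwrites an earlier one
    return result


# Hypothetical row mixing the key styles mentioned in the question.
row = {"Sl No": 1, "Description": "XYZ", "HSNSACCode": "8471.60.60",
       "Quantity": 7, "UnitPrice": 1741}
print(standardize(row))
```

To convert the whole dictionary, something like `{'item': [standardize(r) for r in items['item']]}` should work. Note that when two source keys map to the same fixed key, the later one wins: in the question's sample data both 'UnitPrice' and 'Rate' map to 'Rate', and both 'Total' and 'LineTotal' map to 'Total', so the alias lists would need adjusting to get exactly the desired output.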