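How to parse multi-line data and multi-line strings with Python and extract the data into a JSON file

Asked: 2019-02-22 11:20:12

Tags: python json list

I am trying to add the data below to a JSON file. This is just an example, as I am still learning how to do it.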

ITEM    QTY   ID        DESCR   LOCATION                    
item1   3     it111     Gold    Rack11      
item2   10    it222     Silver  Rack22   
item3   6     it333     Red     Rack33      
item4   1     it444     Blue    Rack44 
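For example, as shown below, I am able to add the class and owner values because each is a single line containing a single string. The details key, however, consists of multiple lines of keys and values, and I am not sure how to read them line by line and parse them into JSON.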

{
     "product": [
        {
         "class":"food",
         "owner":"user1"
        }
     ]
}

The expected final output is as follows:

{
     "product": [
        {
         "class":"food",
         "owner":"user1",
         "details": [
         {
          "item":"item1",
          "qty":"3",
          "id":"it111",
          "desc":"Gold",
          "loct":"Rack11"
         },
         {
          "item":"item2",
          "qty":"10",
          "id":"it222",
          "desc":"Silver",
          "loct":"Rack22"
         },
         {
          "item":"item3",
          "qty":"6",
          "id":"it333",
          "desc":"Red",
          "loct":"Rack33"
         },
         {
          "item":"item4",
          "qty":"1",
          "id":"it444",
          "desc":"Blue",
          "loct":"Rack44"
         }
        ] 
       }
     ]
}

My list is as follows:

product = "class","owner","details"

The problem is that I do not know how to get the detail rows into "details" and build the nested JSON structure. Any help is appreciated. Thanks.


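Using the csv reader is the solution when the input text is tab-delimited, and it works as described. However, when I applied it to another, similar set of input text, it gave me this error:

ValueError: need more than 4 values to unpack

A sample of that input text is as follows: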

Local Interface   Parent Interface   Chassis Id          Port info    System Name
xe-3/0/4.0        ae31.0             b0:c6:9a:63:80:40   xe-0/0/0.0   host.xsrt1.net
xe-3/0/5.0        ae31.0             b0:c6:9a:63:80:40   xe-0/0/1.0   host.xsrt1.net
xe-3/0/6.0        ae31.0             b0:c6:9a:63:80:40   xe-0/0/2.0   host.xsrt1.net
xe-3/0/7.0        ae31.0             b0:c6:9a:63:80:40   xe-0/0/3.0   host.xsrt1.net
xe-3/0/0.0        ae31.0             b0:c6:9a:63:80:40   xe-0/1/0.0   host.xsrt1.net
xe-3/0/1.0        ae31.0             b0:c6:9a:63:80:40   xe-0/1/1.0   host.xsrt1.net
xe-3/0/2.0        ae31.0             b0:c6:9a:63:80:40   xe-0/1/2.0   host.xsrt1.net
xe-3/0/3.0        ae31.0             b0:c6:9a:63:80:40   xe-0/1/3.0   host.xsrt1.net

I am not sure why, but the text may not be strictly tab-delimited. If that is the case, how can I convert it into a valid tab-delimited format? Thanks.

Update 1: For the input above, I split the lines with the following test code:

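import csv
from pprint import pprint

with open('lldp.csv', 'r', newline='') as csv_file:
   # replace every pair of spaces with a comma, then let csv split on commas
   reader = csv.reader(line.replace('  ', ',') for line in csv_file)
   my_list = list(reader)
   pprint(my_list)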

The output is as follows:

[['Local Interface',' Parent Interface',' Chassis Id','','','','','Port info','','System Name'],
 ['xe-3/0/4.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/0/0.0',' host.jnpr.net'],
 ['xe-3/0/5.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/0/1.0',' host.jnpr.net'],
 ['xe-3/0/6.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/0/2.0',' host.jnpr.net'],
 ['xe-3/0/7.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/0/3.0',' host.jnpr.net'],
 ['xe-3/0/0.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/1/0.0',' host.jnpr.net'],
 ['xe-3/0/1.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/1/1.0',' host.jnpr.net'],
 ['xe-3/0/2.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/1/2.0',' host.jnpr.net'],
 ['xe-3/0/3.0','','','','ae31.0','','','','','',' b0:c6:9a:63:80:40',' xe-0/1/3.0',' host.jnpr.net']]

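How do I remove the unwanted '' entries from the output above, and how do I start reading from the second line (the first line is just the header)? From the resulting list, I want to build the JSON structure specified above.

I will open a new question for the issue above and focus on the output above. Thanks.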

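2 answers:

Answer 0: (score: 0)

If the columns are tab-separated, I suggest using the csv reader.

First, create a base dictionary with the "class" and "owner" values and an empty list for "details". Then parse the file row by row and append each row's details to that list.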

import csv
import json

out = {
    "product": [
        {
            "class": "food",
            "owner": "user1",
            "details": []
        }
    ]
}

with open("data.csv") as f:
    reader = csv.reader(f, delimiter="\t")
    next(reader) # skip header

    for row in reader:

        detail = {
            "item": row[0],
            "qty" : row[1],
            "id"  : row[2],
            "desc": row[3],
            "loct": row[4]
        }

        out["product"][0]["details"].append(detail)

# now out contains the final dictionary, you can output it like this:
print(json.dumps(out, indent=4))

It is not clear to me why "product" is a list if it holds only one item; I assume you will be adding more products to it.
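
If the columns are padded with runs of spaces rather than tabs, as the second sample in the question appears to be, the csv reader with delimiter="\t" will not split the rows into separate fields, which is likely what leads to the unpacking error. A minimal sketch of one alternative, assuming every column gap is at least two whitespace characters and no field is empty: split each line with re.split. The key names below are made up from the header row purely for illustration, and the sample is inlined instead of being read from a file.

```python
import json
import re

# Sample copied from the question; columns are padded with runs of spaces.
raw = """\
Local Interface   Parent Interface   Chassis Id          Port info    System Name
xe-3/0/4.0        ae31.0             b0:c6:9a:63:80:40   xe-0/0/0.0   host.xsrt1.net
xe-3/0/5.0        ae31.0             b0:c6:9a:63:80:40   xe-0/0/1.0   host.xsrt1.net
"""


def split_columns(line):
    # Two or more whitespace characters mark a column boundary, so the
    # single space inside "Local Interface" does not split that field.
    return re.split(r"\s{2,}", line.strip())


# Drop blank lines, then treat the first remaining line as the header.
lines = [ln for ln in raw.splitlines() if ln.strip()]

# Hypothetical key names derived from the header row (an assumption).
keys = [name.lower().replace(" ", "_") for name in split_columns(lines[0])]

# One dictionary per data row, starting from the second line.
details = [dict(zip(keys, split_columns(ln))) for ln in lines[1:]]

print(json.dumps(details, indent=4))
```

Because the split happens on whole runs of whitespace, no empty '' entries are produced, and slicing with lines[1:] skips the header. This would break if a field were empty or itself contained two consecutive spaces, so treat it as a starting point only.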

Answer 1: (score: 0)

I'm not sure how you are iterating through your initial list of class and owner, but this will generate the output you want:

import pandas as pd
import json


data = [
['item1','3','it111','Gold','Rack11'],
['item2','10','it222','Silver','Rack22'],  
['item3','6','it333','Red','Rack33'],      
['item4','1','it444','Blue','Rack44']]

df = pd.DataFrame(data,columns=['ITEM','QTY','ID','DESCR','LOCATION'])

# The data above is hard-coded so there is something to work with; you can read it in with pandas if it's an Excel or CSV file

# df = pd.read_csv('path/to/datafile.csv')

jsonDict = {}
jsonDict["product"] = []
jsonDict["product"].append({})

jsonDict["product"][0]["class"] = "food"
jsonDict["product"][0]["owner"] = "user1"
jsonDict["product"][0]["details"] = []

for i, row in df.iterrows():

    temp_dict = {}
    temp_dict['item'] = row['ITEM'] 
    temp_dict['qty'] = row['QTY'] 
    temp_dict['id'] = row['ID']  # the expected output uses the key "id"
    temp_dict['desc'] = row['DESCR'] 
    temp_dict['loct'] = row['LOCATION'] 

    jsonDict["product"][0]["details"].append(temp_dict)

with open('data.json', 'w') as fp:
    json.dump(jsonDict, fp, indent=3, sort_keys=False)
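
If pandas is not available, the same records can be built with the standard library alone. This is only a sketch under the same assumptions as above: the rows are already split into lists of strings, and the key names are taken from the expected output in the question. Only two of the four rows are shown, and json_dict is an illustrative variable name.

```python
import json

# Key names taken from the expected output in the question.
keys = ["item", "qty", "id", "desc", "loct"]

# Two of the sample rows, already split into fields.
data = [
    ["item1", "3", "it111", "Gold", "Rack11"],
    ["item2", "10", "it222", "Silver", "Rack22"],
]

# zip pairs each key with the value in the same position of the row.
details = [dict(zip(keys, row)) for row in data]

json_dict = {
    "product": [
        {"class": "food", "owner": "user1", "details": details}
    ]
}

print(json.dumps(json_dict, indent=3))
```

Dictionaries keep insertion order, so the keys appear in the JSON in the order listed in keys.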