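Convert a string into multiple JSON objects

Time: 2018-07-18 05:11:16

Tags: python

The string below contains multiple lines. For each line, I want to split the string and add the result to a JSON output file. I currently do this with string.gettext().split and regular expressions, but I am not sure this is the best approach.

Input file: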

Server:prod01
Available memory: 20480      Disk:200     CPU:4
Used memory:12438              Disk:120     CPU:3
Unused memory:8042            Disk:80       CPU:1
Server:prod02
Available memory: 40960      Disk:500     CPU:8
Used memory:20888              Disk:320     CPU:3
Unused memory:20072          Disk:180    CPU:5

Expected JSON output:

{"prod01_available_memory":20480}
{"prod01_used_memory":12438}
{"prod01_unused_memory":8042}
{"prod01_available_disk":200}
{"prod01_used_disk":120}
{"prod01_unused_disk":80}
{"prod01_available_cpu":4}
{"prod01_used_cpu":3}
{"prod01_unused_cpu":1}
{"prod02_available_memory":40960}
{"prod02_used_memory":20888}
{"prod02_unused_memory":20072}
{"prod02_available_disk":500}
{"prod02_used_disk":320}
{"prod02_unused_disk":180}
{"prod02_available_cpu":8}
{"prod02_used_cpu":3}
{"prod02_unused_cpu":5}

Thanks, 临空

Here is my code:

import json
import re
from datetime import date

def tsplit(string, *delimiters):
    pattern = '|'.join(map(re.escape, delimiters))
    return re.split(pattern, string)


prelist = pre.get_text().splitlines()
server_name = re.split('server|:',prelist[0])[2].strip()
if server_name == 'prod01':
    #print prelist[1]
    prod01_memory_actv = int(re.split('Activated memory|:|Disk|:|CPU|:',prelist[1])[2])
    prod01_Disk_actv = int(re.split('Activated memory|:|Disk|:|CPU|:',prelist[1])[4])
    prod01_CPU_actv = int(re.split('Activated memory|:|Disk|:|CPU|:',prelist[1])[6])
    #print prelist[2]
    prod01_memory_cons = int(re.split('memory consumed|:|Disk|:|CPU|:',prelist[2])[2])
    prod01_Disk_cons = int(re.split('memory consumed|:|Disk|:|CPU|:',prelist[2])[4])
    prod01_CPU_cons = int(re.split('memory consumed|:|Disk|:|CPU|:',prelist[2])[6])
    #print prelist[4]
    prod01_memory_unused = int(re.split('memory unused|:|Disk|:|CPU|:',prelist[4])[2])
    prod01_Disk_unused = int(re.split('memory unused|:|Disk|:|CPU|:',prelist[4])[4])
    prod01_CPU_unused = int(re.split('memory unused|:|Disk|:|CPU|:',prelist[4])[6])
elif server_name == 'prod02':
    #print prelist[1]
    prod02memory_actv = int(re.split('Activated memory|:|Disk|:|CPU|:',prelist[1])[2])
    prod02Disk_actv = int(re.split('Activated memory|:|Disk|:|CPU|:',prelist[1])[4])
    prod02CPU_actv = int(re.split('Activated memory|:|Disk|:|CPU|:',prelist[1])[6])
    #print prelist[2]
    prod02memory_cons = int(re.split('memory consumed|:|Disk|:|CPU|:',prelist[2])[2])
    prod02Disk_cons = int(re.split('memory consumed|:|Disk|:|CPU|:',prelist[2])[4])
    prod02CPU_cons = int(re.split('memory consumed|:|Disk|:|CPU|:',prelist[2])[6])
    #print prelist[4]
    prod02memory_unused = int(re.split('memory unused|:|Disk|:|CPU|:',prelist[4])[2])
    prod02Disk_unused = int(re.split('memory unused|:|Disk|:|CPU|:',prelist[4])[4])
    prod02CPU_unused = int(re.split('memory unused|:|Disk|:|CPU|:',prelist[4])[6])
else:
    # assign all variables 0
    pass

.....

    proc_item["logtime"] = str(t1)
    proc_item["prod01_memory_actv"] = prod01_memory_actv
    proc_item["prod01_Disk_actv"] = prod01_Disk_actv
    proc_item["prod01_CPU_actv"] = prod01_CPU_actv
    ......
    # for all other variables...

    proc_data.append(proc_item)
    with open("./proc_"+ str(date.today()) + ".txt", 'a+') as f:
        json.dump(proc_data, f)
        f.write("\n")

I have only basic knowledge of Python.

3 answers:

Answer 0 (score: 1)

This uses only string splitting and list indices:

hostmtrcs = "Server:prod01 Available memory:20480 Disk:200 CPU:4 Used memory:12438 Disk:120 CPU:3 Unused memory:8042 " \
            "Disk:80 CPU:1 Server:prod02 Available memory: 40960 Disk:500 CPU:8 Used memory:20888 Disk:320 CPU:3 Unused " \
            "memory:20072 Disk:180 CPU:5 "

datasplt = hostmtrcs.split(":")
hstname = ''
attrkey = ''
attrvalue = ''

# The last chunk holds only the final value, so the loop stops one element early.
for word in range(len(datasplt) - 1):

    if "Server" not in datasplt[word]:
        # split() with no argument ignores leading and repeated spaces
        elmnt = datasplt[word].split()
        if 'prod' in datasplt[word]:
            hstname = elmnt[0].lower()
        if len(elmnt) == 3:
            attrkey = elmnt[1].lower() + "_" + elmnt[2].lower()  # e.g. available_memory
        else:
            # Disk and CPU keys get no available/used/unused prefix here
            attrkey = elmnt[1].lower()

        # retrieve the value from the first token of the next chunk
        nxtelmnt = datasplt[word + 1].split()
        attrvalue = nxtelmnt[0]
        finalfrmt = '{' + '"' + hstname + "_" + attrkey + '"' + ":" + attrvalue + '}'
        print(finalfrmt)
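
The final line above assembles the output by string concatenation. As a possible alternative, json.dumps can produce the same single-key object and handles the quoting itself. This is only a sketch: format_metric is a hypothetical helper name, and it assumes the host name, key, and value have already been extracted as in the loop above.

```python
import json

def format_metric(host, key, value):
    """Return one single-key JSON object, e.g. {"prod01_disk":200}."""
    name = "{}_{}".format(host, key).lower().replace(" ", "_")
    # separators=(",", ":") drops the spaces json.dumps inserts by default
    return json.dumps({name: int(value)}, separators=(",", ":"))

print(format_metric("prod01", "Available memory", "20480"))
```

Converting the value with int() also means a non-numeric token raises an error instead of silently producing invalid JSON.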

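Answer 1 (score: 0)

I think you can do this with a dict and then dump it as JSON. (I don't think the output in your question is valid JSON, but I dumped the dict to JSON as you requested.) I have not validated the keys; I assume you are already getting the dictionary data correctly.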

d = { 'Server':'prod01', 
      'Available memory': 20480,
      'Disk':200,
      'CPU':4}

import json

s = json.dumps({str(d['Server']+"_"+key).replace(' ','_'):value for key,value in d.items()})
print(json.loads(s))

>>> {'prod01_Server': 'prod01', 'prod01_Available_memory': 20480, 'prod01_Disk': 200, 'prod01_CPU': 4}
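
The result above still contains a prod01_Server entry and mixed-case keys, which the expected output in the question does not have. A minimal variation, assuming the same dictionary d, might skip the Server key and lowercase the names (the variable names server and renamed are illustrative only):

```python
import json

d = {'Server': 'prod01', 'Available memory': 20480, 'Disk': 200, 'CPU': 4}

server = d['Server']
renamed = {
    "{}_{}".format(server, key).replace(' ', '_').lower(): value
    for key, value in d.items()
    if key != 'Server'  # the host name is only used as a prefix
}
print(json.dumps(renamed))
```

The Disk and CPU keys still lack the available/used/unused prefix, because that information is not present in the dictionary itself.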

Answer 2 (score: 0)

You should split the input text piece by piece, according to what you are looking for at each level.

data = '''Server:prod01
Available memory: 20480      Disk:200     CPU:4
Used memory:12438              Disk:120     CPU:3
Unused memory:8042            Disk:80       CPU:1
Server:prod02
Available memory: 40960      Disk:500     CPU:8
Used memory:20888              Disk:320     CPU:3
Unused memory:20072          Disk:180    CPU:5'''

import re
import json
print(json.dumps({
    '_'.join((s, l.split(' ', 1)[0], k)).lower(): int(v)
    for s, d in [i.split('\n', 1) for i in data.split('Server:') if i]  # (server name, its lines)
    for l in d.split('\n')                                              # one line per category
    for k, v in re.findall(r'(\w+):\s*(\d+)', l)                        # (resource, number) pairs
}))

This outputs:

{"prod01_available_memory": 20480, "prod01_available_disk": 200, "prod01_available_cpu": 4, "prod01_used_memory": 12438, "prod01_used_disk": 120, "prod01_used_cpu": 3, "prod01_unused_memory": 8042, "prod01_unused_disk": 80, "prod01_unused_cpu": 1, "prod02_available_memory": 40960, "prod02_available_disk": 500, "prod02_available_cpu": 8, "prod02_used_memory": 20888, "prod02_used_disk": 320, "prod02_used_cpu": 3, "prod02_unused_memory": 20072, "prod02_unused_disk": 180, "prod02_unused_cpu": 5}
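
The output above is a single JSON object, while the question shows one object per line. If that line-per-object layout is what is wanted, a sketch along the following lines might work. It assumes the input format shown in the question; iter_metrics is a hypothetical name, and the objects come out in input order rather than grouped by resource as in the question.

```python
import json
import re

def iter_metrics(text):
    """Yield (key, value) pairs such as ('prod01_used_disk', 120)."""
    server = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Server:"):
            # remember the host name for the lines that follow
            server = line.split(":", 1)[1].strip()
            continue
        prefix = line.split()[0].lower()  # available / used / unused
        for name, value in re.findall(r"(\w+):\s*(\d+)", line):
            yield "{}_{}_{}".format(server, prefix, name.lower()), int(value)

sample = """Server:prod01
Available memory: 20480      Disk:200     CPU:4
Used memory:12438              Disk:120     CPU:3"""

for key, value in iter_metrics(sample):
    # one compact JSON object per output line
    print(json.dumps({key: value}, separators=(",", ":")))
```

To write a file instead of printing, the same loop can call f.write(...) followed by a newline on an open file handle.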