Python JSON data from human-readable text

Time: 2018-05-17 14:20:22

Tags: python json

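I am working on a Flask server that accepts POSTs of .log files (basically just text files). These files contain the output of the smartctl command-line tool, invoked as:

smartctl -a /dev/sda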

I want to parse this human-readable information into JSON data (still human-readable, lol) and send the resulting JSON to a database.

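I know that the development branch of smartctl on GitHub (version 6.7) has a feature that retrieves the data as JSON and saves it to a file, using the argument

-j

However, I cannot use this feature because I need to use the stable version of smartctl.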

Here is the output I currently get:

{
"  3 Spin_Up_Time            ": "027   200   199   021    Pre-fail  Always       -       983",
"  4 Start_Stop_Count        ": "032   100   100   000    Old_age   Always       -       30",
"  5 Reallocated_Sector_Ct   ": "033   200   200   140    Pre-fail  Always       -       0",
"  7 Seek_Error_Rate         ": "02e   200   200   000    Old_age   Always       -       0",
"  9 Power_On_Hours          ": "032   049   049   000    Old_age   Always       -       37855",
" 10 Spin_Retry_Count        ": "032   100   253   000    Old_age   Always       -       0",
" 11 Calibration_Retry_Count ": "032   100   253   000    Old_age   Always       -       0",
" 12 Power_Cycle_Count       ": "032   100   100   000    Old_age   Always       -       29",
"192 Power-Off_Retract_Count ": "032   200   200   000    Old_age   Always       -       28",
"193 Load_Cycle_Count        ": "032   200   200   000    Old_age   Always       -       1",
"194 Temperature_Celsius     ": "022   105   102   000    Old_age   Always       -       38",
"196 Reallocated_Event_Count ": "032   200   200   000    Old_age   Always       -       0",
"197 Current_Pending_Sector  ": "032   200   200   000    Old_age   Always       -       0",
"198 Offline_Uncorrectable   ": "030   200   200   000    Old_age   Offline      -       0",
"199 UDMA_CRC_Error_Count    ": "032   200   200   000    Old_age   Always       -       0",
"200 Multi_Zone_Error_Rate   ": "008   200   200   000    Old_age   Offline      -       0",
"ATA Version is": "ATA8-ACS (minor revision not indicated)",
"Add. Product Id": "DELL(tm)",
"Device Model": "WDC WD2502ABYS-18B7A0",
"Firmware Version": "02.03B05",
"LU WWN Device Id": "<numbers>",
"Local Time is": "Mon Apr 23 10",
"Model Family": "Western Digital RE3 Serial ATA",
"Rotation Rate": "7200 rpm",
"SATA Version is": "SATA 2.5, 3.0 Gb/s",
"SMART overall-health self-assessment test result": "PASSED",
"SMART support is": "Enabled",
"Sector Size": "512 bytes logical/physical",
"Serial Number": "<serial>",
"User Capacity": "250.000.000.000 bytes [250 GB]"

}

The columns of the SMART attributes are as follows:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

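The device info is probably fine, but the SMART attributes are probably not useful in their current form: the keys and most of the values are padded with spaces and tabs. I am not sure how to parse this properly. I need some kind of nested structure to store the SMART attributes, but I am stuck.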
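One possible way to handle an attribute row is to split it on runs of whitespace, capping the split at the ten columns of the header so that a raw value containing spaces stays in one piece. This is a minimal sketch, assuming every row has exactly the columns shown in the header above and that the flag column looks like `0x0027` (reconstructed from the `0x0` split in the current output). `COLUMNS` and `parse_attribute_row` are hypothetical names, not part of smartctl.

```python
# Column names taken from the smartctl attribute table header, lower-cased.
COLUMNS = ["id", "name", "flag", "value", "worst", "thresh",
           "type", "updated", "when_failed", "raw_value"]


def parse_attribute_row(line):
    """Split one SMART attribute row into a dict keyed by column name."""
    # strip() removes the padding; maxsplit keeps everything after the
    # ninth column together as the raw value.
    fields = line.strip().split(None, len(COLUMNS) - 1)
    if len(fields) != len(COLUMNS):
        raise ValueError("unexpected attribute row: %r" % line)
    return dict(zip(COLUMNS, fields))


# Sample row rebuilt from the output above (flag assumed to be 0x0027).
row = parse_attribute_row(
    "  3 Spin_Up_Time            0x0027   200   199   021    "
    "Pre-fail  Always       -       983"
)
print(row["name"], row["raw_value"])
```

All values are still strings at this point; converting them to numbers is a separate step.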

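I noticed that the length of the file is fixed (or at least, it has never differed for me), so I used that to parse it cheaply. My current code is as follows: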

import json


def parse_line(line):
    """Split a 'key: value' line of the device info section."""
    # Note: split(':') cuts at every colon, so a value that itself contains
    # colons (such as the local time) loses everything after its first part.
    splitted = line.split(':')
    return splitted[0], splitted[1].strip()


def parse_line_smart(line):
    """Split a SMART attribute row at the '0x0' prefix of its flag column."""
    splitted = line.split("0x0")
    return splitted[0], splitted[1].strip()


# file_body holds the text of the uploaded .log file
lines = file_body.split("\n")

my_data = {}  # Empty dictionary object
for line in lines[4:22]:  # device info
    if ":" in line and not line.startswith("Device is:"):
        key, value = parse_line(line)
        my_data[key] = value

for line in lines[61:77]:  # SMART attributes
    if "0x0" in line:
        key, value = parse_line_smart(line)
        my_data[key] = value

if 'Raw_Read_Error_Rate' in my_data:
    pass  # Parse some more

json_data = json.dumps(my_data)
print(json_data)
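The fixed line ranges would stop working as soon as a drive reports one attribute more or less. A sketch of an alternative, assuming the attribute table always starts right after the `ID# ATTRIBUTE_NAME` header line and ends at the next blank line; this assumption is not checked against the smartctl documentation. It also splits on the first colon only, so a value such as the local time keeps its minutes and seconds. `split_sections` is a hypothetical name, and the sample text is rebuilt from the output above with an invented time of day.

```python
def split_sections(file_body):
    """Return (info, rows): key/value pairs and raw attribute table lines."""
    info = {}
    rows = []
    in_table = False
    for line in file_body.splitlines():
        if line.startswith("ID#"):
            in_table = True       # header line: the table starts below it
            continue
        if in_table:
            if not line.strip():
                in_table = False  # a blank line ends the table
            else:
                rows.append(line)
        elif ":" in line:
            # partition() splits at the first colon only
            key, _, value = line.partition(":")
            info[key.strip()] = value.strip()
    return info, rows


# Sample input; the time "10:15:00 2018" is made up for illustration.
sample = """Device Model:     WDC WD2502ABYS-18B7A0
Local Time is:    Mon Apr 23 10:15:00 2018

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   200   199   021    Pre-fail  Always       -       983
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       30

"""
info, rows = split_sections(sample)
print(info["Local Time is"])
print(len(rows))
```

Every line containing a colon outside the table is treated as a key/value pair here, so lines from later sections of the report would need filtering in a real parser.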

The device info probably needs a sub-dictionary, such as:

"device" : {
  "model_family" : "Western Digital RE3 Serial ATA",
  "model_name" : "WDC WD2502ABYS-18B7A0"
},
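A sub-dictionary like this could be built from the flat key/value pairs with a small renaming table. A minimal sketch, assuming the flat dictionary uses the smartctl labels seen in the current output as keys; `DEVICE_KEYS` and `build_device` are hypothetical names.

```python
import json

# Hypothetical mapping from smartctl labels to the desired JSON key names.
DEVICE_KEYS = {
    "Model Family": "model_family",
    "Device Model": "model_name",
}


def build_device(info):
    """Pick the mapped entries out of the flat info dict and rename them."""
    return {new: info[old] for old, new in DEVICE_KEYS.items() if old in info}


# Flat data as the current parser produces it (values from the output above).
flat = {
    "Model Family": "Western Digital RE3 Serial ATA",
    "Device Model": "WDC WD2502ABYS-18B7A0",
    "Rotation Rate": "7200 rpm",
}
document = {"device": build_device(flat)}
print(json.dumps(document, indent=2))
```

Labels missing from the mapping, such as the rotation rate here, are simply left out; adding a line to `DEVICE_KEYS` includes them.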

I would like the SMART attribute output to look something like:

"smart_attributes" : {
  "table" : [
    {
      "id" : 1,
      "name" : "Raw_Read_Error_Rate",
      "value" : 200,
      "worst" : 200,
      "thresh" : 51,
      "when_failed" : "",
      "flags" : "0x002f",
      "raw" : "0"
    }
  ]
},
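Getting from split columns to this layout is mostly a matter of type conversion. A minimal sketch, assuming each row has already been split into a dict of strings keyed by the lower-cased header names, and that a `-` in the WHEN_FAILED column should become an empty string. `to_table_entry` is a hypothetical name, and the flag value `0x0032` is reconstructed from the `0x0` split in the current output.

```python
import json


def to_table_entry(columns):
    """Convert a dict of string columns into the desired entry layout."""
    when_failed = columns["when_failed"]
    return {
        "id": int(columns["id"]),
        "name": columns["name"],
        "value": int(columns["value"]),    # int("049") gives 49
        "worst": int(columns["worst"]),
        "thresh": int(columns["thresh"]),
        "when_failed": "" if when_failed == "-" else when_failed,
        "flags": columns["flag"],
        "raw": columns["raw_value"],       # kept as a string, as in the target
    }


# Columns of the Power_On_Hours row from the output above, as strings.
columns = {
    "id": "9", "name": "Power_On_Hours", "flag": "0x0032",
    "value": "049", "worst": "049", "thresh": "000",
    "type": "Old_age", "updated": "Always",
    "when_failed": "-", "raw_value": "37855",
}
result = {"smart_attributes": {"table": [to_table_entry(columns)]}}
print(json.dumps(result, indent=2))
```

The raw value stays a string because the desired output shows it quoted.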

0 answers:

No answers