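I need some advice. I have a text file containing information that I need to extract and save as a JSON file. The file is organised in blocks but is otherwise unstructured.
How can I achieve this? I don't know where to start. My idea is to look for Type : Router, but how do I iterate over each block and pick out only the details of the P-2-P links? Thanks for any advice. The file looks like this: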
Type : Router
Ls id : 1.1.1.2
Adv rtr : 1.1.1.2
Ls age : 201
Len : 84
Link count: 5
   * Link ID: 1.1.1.2
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium
   * Link ID: 1.1.1.4
     Data : 192.168.100.34
     Link Type: P-2-P
     Metric : 1
   * Link ID: 192.168.100.33
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium
   * Link ID: 1.1.1.1
     Data : 192.168.100.53
     Link Type: P-2-P
     Metric : 1
   * Link ID: 192.168.100.54
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium

Type : Router
Ls id : 1.1.1.1
Adv rtr : 1.1.1.1
Ls age : 1699
Len : 96
Options : ASBR E
seq# : 80008d72
chksum : 0x16fc
Link count: 6
   * Link ID: 1.1.1.1
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium
   * Link ID: 1.1.1.1
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 12
     Priority : Medium
   * Link ID: 1.1.1.3
     Data : 192.168.100.26
     Link Type: P-2-P
     Metric : 10
   * Link ID: 192.168.100.25
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 10
     Priority : Medium
   * Link ID: 1.1.1.2
     Data : 192.168.100.54
     Link Type: P-2-P
     Metric : 10
   * Link ID: 192.168.100.53
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 10
     Priority : Medium
Extract only the blocks with Type : Router. From each such block, the information to capture is:
(1)Ls id : 1.1.1.2
and, under Link count, only the links whose Link Type is P-2-P:
(a)Link ID: 1.1.1.4
(b)Data : 192.168.100.34
(c)Link Type: P-2-P
(d)Metric : 1
(a)Link ID: 1.1.1.1
(b)Data : 192.168.100.53
(c)Link Type: P-2-P
(d)Metric : 1
Then, for the next Type : Router block, capture:
(2)Ls id : 1.1.1.1
and, under Link count, only the links whose Link Type is P-2-P:
(a)Link ID: 1.1.1.3
(b)Data : 192.168.100.26
(c)Link Type: P-2-P
(d)Metric : 10
(a)Link ID: 1.1.1.2
(b)Data : 192.168.100.54
(c)Link Type: P-2-P
(d)Metric : 10
**There is another Link Type (StubNet), but I am only interested in capturing the links with Link Type: P-2-P.**
The expected JSON is as follows:
{
  "oppf": [
    {
      "Sid": "1.1.1.2",
      "Did": "1.1.1.4",
      "Sport": "192.168.100.34",
      "Netype": "P-2-P",
      "Metric": "1"
    },
    {
      "Sid": "1.1.1.2",
      "Did": "1.1.1.1",
      "Sport": "192.168.100.53",
      "Netype": "P-2-P",
      "Metric": "1"
    },
    {
      "Sid": "1.1.1.1",
      "Did": "1.1.1.3",
      "Sport": "192.168.100.26",
      "Netype": "P-2-P",
      "Metric": "10"
    },
    {
      "Sid": "1.1.1.1",
      "Did": "1.1.1.2",
      "Sport": "192.168.100.54",
      "Netype": "P-2-P",
      "Metric": "10"
    }
  ]
}
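Answer 0 (score: 1)
To me, it is quite well structured. It uses indentation to mark sub-items, an asterisk (*) to mark the start of a new dictionary, and blank lines to mark the start of a new router block. It also has a colon (:) on every line, which can be used to split the line into a key and a value.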
data = '''Type : Router
Ls id : 1.1.1.2
Adv rtr : 1.1.1.2
Ls age : 201
Len : 84
Link count: 5
   * Link ID: 1.1.1.2
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium
   * Link ID: 1.1.1.4
     Data : 192.168.100.34
     Link Type: P-2-P
     Metric : 1
   * Link ID: 192.168.100.33
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium
   * Link ID: 1.1.1.1
     Data : 192.168.100.53
     Link Type: P-2-P
     Metric : 1
   * Link ID: 192.168.100.54
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium

Type : Router
Ls id : 1.1.1.1
Adv rtr : 1.1.1.1
Ls age : 1699
Len : 96
Options : ASBR E
seq# : 80008d72
chksum : 0x16fc
Link count: 6
   * Link ID: 1.1.1.1
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 1
     Priority : Medium
   * Link ID: 1.1.1.1
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 12
     Priority : Medium
   * Link ID: 1.1.1.3
     Data : 192.168.100.26
     Link Type: P-2-P
     Metric : 10
   * Link ID: 192.168.100.25
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 10
     Priority : Medium
   * Link ID: 1.1.1.2
     Data : 192.168.100.54
     Link Type: P-2-P
     Metric : 10
   * Link ID: 192.168.100.53
     Data : 255.255.255.255
     Link Type: StubNet
     Metric : 10
     Priority : Medium'''
results = []
group = {'items': []}
subgroup = None

for line in data.split('\n'):
    if not line.strip():
        # A blank line ends the current block: store the pending link first.
        if subgroup:
            group['items'].append(subgroup)
        results.append(group)
        group = {'items': []}
        subgroup = None
    elif not line.startswith(' '):
        # Unindented line: a top-level "key : value" field of the block.
        key, val = line.split(':', 1)
        group[key.strip()] = val.strip()
    else:
        # Indented line: part of a link; '*' marks the start of a new one.
        if '*' in line:
            if subgroup:
                group['items'].append(subgroup)
            subgroup = {}
        key, val = line.split(':', 1)
        subgroup[key.replace('*', '').strip()] = val.strip()

# Store the last link and the last block.
if subgroup:
    group['items'].append(subgroup)
results.append(group)

print(results)

To display it nicely:

import json
print(json.dumps(results, indent=2))

Result:

[
  {
    "items": [
      {
        "Link ID": "1.1.1.2",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "1",
        "Priority": "Medium"
      },
      {
        "Link ID": "1.1.1.4",
        "Data": "192.168.100.34",
        "Link Type": "P-2-P",
        "Metric": "1"
      },
      {
        "Link ID": "192.168.100.33",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "1",
        "Priority": "Medium"
      },
      {
        "Link ID": "1.1.1.1",
        "Data": "192.168.100.53",
        "Link Type": "P-2-P",
        "Metric": "1"
      },
      {
        "Link ID": "192.168.100.54",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "1",
        "Priority": "Medium"
      }
    ],
    "Type": "Router",
    "Ls id": "1.1.1.2",
    "Adv rtr": "1.1.1.2",
    "Ls age": "201",
    "Len": "84",
    "Link count": "5"
  },
  {
    "items": [
      {
        "Link ID": "1.1.1.1",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "1",
        "Priority": "Medium"
      },
      {
        "Link ID": "1.1.1.1",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "12",
        "Priority": "Medium"
      },
      {
        "Link ID": "1.1.1.3",
        "Data": "192.168.100.26",
        "Link Type": "P-2-P",
        "Metric": "10"
      },
      {
        "Link ID": "192.168.100.25",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "10",
        "Priority": "Medium"
      },
      {
        "Link ID": "1.1.1.2",
        "Data": "192.168.100.54",
        "Link Type": "P-2-P",
        "Metric": "10"
      },
      {
        "Link ID": "192.168.100.53",
        "Data": "255.255.255.255",
        "Link Type": "StubNet",
        "Metric": "10",
        "Priority": "Medium"
      }
    ],
    "Type": "Router",
    "Ls id": "1.1.1.1",
    "Adv rtr": "1.1.1.1",
    "Ls age": "1699",
    "Len": "96",
    "Options": "ASBR E",
    "seq#": "80008d72",
    "chksum": "0x16fc",
    "Link count": "6"
  }
]
So now you have a Python structure, and you can pull whatever you need out of it.
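That last step could look roughly like the sketch below. It assumes a results list shaped like the output printed above (only a small stand-in is included here so the snippet runs on its own). The function name to_oppf is made up for illustration, and the key names Sid, Did, Sport, Netype and Metric are taken from the expected JSON in the question.

```python
import json

# Small stand-in for the `results` list built by the parser (one router, two links).
results = [
    {
        "items": [
            {"Link ID": "1.1.1.2", "Data": "255.255.255.255",
             "Link Type": "StubNet", "Metric": "1", "Priority": "Medium"},
            {"Link ID": "1.1.1.4", "Data": "192.168.100.34",
             "Link Type": "P-2-P", "Metric": "1"},
        ],
        "Type": "Router",
        "Ls id": "1.1.1.2",
    },
]


def to_oppf(groups):
    """Keep only the P-2-P links of Router blocks and rename their keys.

    Hypothetical helper: the output key names follow the question's
    expected JSON, not anything defined by the parser itself.
    """
    links = []
    for group in groups:
        if group.get("Type") != "Router":
            continue  # ignore blocks of any other type
        for item in group.get("items", []):
            if item.get("Link Type") != "P-2-P":
                continue  # skip StubNet and any other link type
            links.append({
                "Sid": group["Ls id"],
                "Did": item["Link ID"],
                "Sport": item["Data"],
                "Netype": item["Link Type"],
                "Metric": item["Metric"],
            })
    return {"oppf": links}


oppf = to_oppf(results)
print(json.dumps(oppf, indent=2))
```

Keeping the filter separate from the parser means the same results list can still be reused if the StubNet links are needed later.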
Answer 1 (score: 1)
To get only the P-2-P links:
import json

data = "..."

result = {}
l = []
for i in data.split("\n\n"):              # blocks are separated by blank lines
    if i:
        p = i.split("*")                  # p[0] is the header, p[1:] are the links
        for x in p[0].split("\n"):
            if x and "Ls id" in x:
                ls_id, ip = x.split(": ")
                ls_id = ls_id.strip()
                ip = ip.strip()
        for y in p[1:]:
            if y and "P-2-P" in y:        # keep only the P-2-P links
                temp = {ls_id: ip}
                for items in y.split("\n"):
                    try:
                        key, value = items.split(": ")
                        key = key.strip()
                        value = value.strip()
                        temp[key] = value
                    except ValueError:
                        pass              # skip lines that are not "key: value"
                l.append(temp)
result["oppf"] = l
print(json.dumps(result, indent=2))
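The question also asks for the output to be saved as a JSON file, using the key names Sid, Did, Sport, Netype and Metric. One possible follow-up is sketched below, assuming result has the shape produced by the loop above (a one-link stand-in is used so the snippet is self-contained). RENAME, rename_keys and the file name oppf.json are hypothetical names chosen for this example.

```python
import json
import os
import tempfile

# Stand-in for the `result` dict produced by the loop (one link only).
result = {
    "oppf": [
        {"Ls id": "1.1.1.2", "Link ID": "1.1.1.4", "Data": "192.168.100.34",
         "Link Type": "P-2-P", "Metric": "1"},
    ]
}

# Mapping from the parsed key names to the names used in the question.
RENAME = {
    "Ls id": "Sid",
    "Link ID": "Did",
    "Data": "Sport",
    "Link Type": "Netype",
    "Metric": "Metric",
}


def rename_keys(link):
    """Return a copy of one link dict with the question's key names."""
    return {RENAME[key]: value for key, value in link.items() if key in RENAME}


output = {"oppf": [rename_keys(link) for link in result["oppf"]]}

# Hypothetical output location; a temporary directory keeps the sketch tidy.
path = os.path.join(tempfile.mkdtemp(), "oppf.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(output, fh, indent=2)

# Read the file back to confirm that it round-trips.
with open(path, encoding="utf-8") as fh:
    saved = json.load(fh)
print(saved == output)
```

In a real script the path would simply be wherever the JSON file should live; json.dump writes straight to the open file, so there is no need to build the string with json.dumps first.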