I'm new to Python programming and have run into some trouble. I have a text file (.dat) organized like this:
{
"token1": [Array of numbers], // metadata, that has to be ignored
"token2": 5000,
"token3": 16.8,
"token4": -7118,
"token5": "2017-11-12 15:38:50",
"token6": false,
"token7": ["LowHor", "LowVer", "HighHor", "HighVer"],
"token8": "RadarID-3",
...
}, ... 50 examples
//
import re
openText = open('bird_2017-11-12_15-38-42.dat')
text = openText.read()
openText.close()
keywords = ['Ceil_H_m', 'Ceil_Vx_mps', 'Ceil_Vy_mps', 'Ceil_Vz_mps',
'Ceil_X_m', 'Ceil_Y_m', 'DateTimeCeil', 'DateTimeFile', 'IsCeilInMeteo',
'IsCeilInNoises', 'Lambda_m', 'NamesChannels', 'NumChannels',
'NumRangesPack', 'NumRaysPack', 'POI_Az_deg', 'POI_Height_m',
'POI_Range_m', 'RadarID']
samples = text.count('TrackNumber')  # metadata that every example has
data = []
//
I need two-dimensional array output like this:
number of example 0 1 ............ 50
----------------------------------------------------------
properties
token2 5000
token3 16.8
token4 -7118
token5 2017-11-12 15:38:50
token6 false
token7 ["LowHor", "LowVer", "HighHor", "HighVer"]
token8 RadarID-3
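The keywords are actually the tokens shown above. I tried using these keywords with re.match() to extract each token's value, but without success.
Answer 0 (score: 0)
It looks like your input file may be almost JSON. Specifically, if you wrap the file's text in square brackets, the result may be a syntactically valid JSON array. If so, this will give you most of what you need: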
import json, collections

# Read the whole file; the with-block closes it automatically.
with open('bird_2017-11-12_15-38-42.dat') as f:
    file_text = f.read()
# Wrap the comma-separated objects in brackets to form a JSON array.
json_text = '[' + file_text + ']'
examples = json.loads(json_text)
# Map each keyword to the list of its values, one per example.
transpose = collections.defaultdict(list)
for example in examples:
    for (keyword, value) in example.items():
        if keyword == 'token1':
            # metadata that has to be ignored
            continue
        transpose[keyword].append(value)
for (keyword, values) in transpose.items():
    print(keyword, values)
This assumes that every example has exactly the same set of keywords. If that is not the case, the code will need to be modified.
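If the examples do not all share the same keywords, one possible modification is sketched below. It is only an illustration: the helper name `transpose_examples`, its `ignore` and `missing` parameters, and the inline `sample_text` are assumptions made for the example, not part of the original file or answer. It pads absent values with a placeholder so that every row keeps one entry per example.

```python
import json

def transpose_examples(examples, ignore=("token1",), missing=None):
    """Collect each keyword's values across all examples, padding gaps."""
    # Union of all keywords in first-seen order, skipping ignored ones.
    keywords = []
    for example in examples:
        for keyword in example:
            if keyword not in ignore and keyword not in keywords:
                keywords.append(keyword)
    # dict.get supplies `missing` where an example lacks a keyword.
    return {k: [ex.get(k, missing) for ex in examples] for k in keywords}

# Hypothetical two-example input; the second example lacks "token3".
sample_text = '{"token1": [1], "token2": 5000, "token3": 16.8}, {"token1": [2], "token2": 5001}'
examples = json.loads("[" + sample_text + "]")
table = transpose_examples(examples)
for keyword, values in table.items():
    print(keyword, values)
```

Because each list has the same length as `examples`, position `i` in any row always refers to example `i`, which the plain `defaultdict` version cannot guarantee when keys are missing.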
Answer 1 (score: 0)
It looks like your data is in JSON format; you only need to wrap it in [] to turn it into a list.
Contents of file.txt:
{
"token1": [1, 2, 3],
"token2": 5000,
"token3": 16.8,
"token4": -7118
},
{
"token1": [1, 2, 3],
"token2": 5001,
"token3": 16.9,
"token4": -6118
},
{
"token1": [1, 2, 3],
"token2": 5002,
"token3": 17.8,
"token4": -5118
},
{
"token1": [1, 2, 3],
"token2": 5003,
"token3": 15.8,
"token4": -3118
}
The script could look like this:
import json

with open('file.txt', 'r') as f_in:
    data = f_in.read()
data = json.loads('[' + data + ']')
keys = [*sorted(data[-1].keys())][1:]
columns = [[v for k, v in sorted(d.items())][1:] for d in data] # [1:] because we don't want the first "token1"
print('{: ^20}'.format('no of example') + ''.join('{: ^20}'.format(i) for i in range(len(columns))))
print('-' * (20 * (len(columns) + 1)))
for v in zip(keys, *columns):
    print(''.join('{: ^20}'.format(i) for i in v))
Prints:
   no of example             0                   1                   2                   3
----------------------------------------------------------------------------------------------------
       token2               5000                5001                5002                5003
       token3               16.8                16.9                17.8                15.8
       token4              -7118               -6118               -5118               -3118
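The script above drops "token1" by position, which only works while that key sorts first, and the '{: ^20}' format spec raises a TypeError for list values such as the "token7" entries in the question. A minimal sketch of a variant that drops keys by name and converts every value to a string before centering it follows; the function name `format_table`, its parameters, and the sample `records` are hypothetical and used only for illustration.

```python
def format_table(records, ignore=("token1",), width=20):
    """Return the table rows as strings, one column per example."""
    cell = "{: ^%d}" % width
    # Keep the first record's key order, dropping ignored keys by name.
    keys = [k for k in records[0] if k not in ignore]
    header = cell.format("no of example") + "".join(
        cell.format(str(i)) for i in range(len(records)))
    lines = [header, "-" * (width * (len(records) + 1))]
    for key in keys:
        # str() first: lists and None reject the centering format spec.
        values = (str(rec.get(key, "")) for rec in records)
        lines.append(cell.format(key) + "".join(cell.format(v) for v in values))
    return lines

# Hypothetical records shaped like the question's data.
records = [
    {"token1": [1, 2, 3], "token2": 5000, "token7": ["LowHor", "LowVer"]},
    {"token1": [1, 2, 3], "token2": 5001, "token7": ["HighHor", "HighVer"]},
]
for line in format_table(records):
    print(line)
```

Values longer than `width` simply widen their cell, so a larger `width` may be needed for long lists or timestamps.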