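I've finally had to give up and ask for help. I'm retrieving a document (with requests) that is in a JSON-like format (but malformed: no double quotes) and trying to extract the data as a normal dictionary. This is what I have. It works, and it produces the output I'm trying to extract the data from.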
import requests

def test():
    url = "http://www.sgx.com/JsonRead/JsonstData"
    payload = {}
    payload['qryId'] = 'RSTIc'
    payload['timeout'] = 60
    header = {'User-Agent' : 'Mozilla/5.0 (compatible; MSIE 10.0; Linux i686; Trident/2.0)', 'Content-Type': 'text/html; charset=utf-8'}
    req = requests.get(url, headers=header, params=payload)
    print(req.url)
    # Decode the raw response bytes into a str
    prelim = req.content.decode('utf-8')
    print(type(prelim))
    print(prelim)

test()
What I want after that is (assuming a properly working dictionary):
for stock in prelim['items']:
    print(stock['N'])
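which should give me a list of all the stock names.
I tried most of the json functions: prelim.json(), load, loads, dump, dumps, parse. None of them seem to work because the data is not formatted properly. I also tried ast.literal_eval(), without success. I tried some examples from Stack Overflow for converting that string into a proper dictionary, but no luck. I just can't seem to get that string to behave as a proper dictionary. If you can point me in the right direction, it would be much appreciated.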
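For reference, the failures described above can be reproduced on a shortened sample. This is only a minimal sketch: the `sample` string and the `try_parse` helper are made up for illustration and are not part of the original code. It suggests that both parsers reject the text because of the `{}&& ` prefix, and that `ast.literal_eval` still rejects it after the prefix is removed because the keys are bare names rather than quoted strings.

```python
import ast
import json

# Hypothetical, shortened sample in the same shape as the response text.
sample = "{}&& {identifier:'ID',items:[{ID:0,N:'AscendasReit'}]}"


def try_parse(parser, text):
    """Return the parsed value, or the name of the exception that was raised."""
    try:
        return parser(text)
    except (ValueError, SyntaxError) as exc:
        return type(exc).__name__


without_prefix = sample.replace("{}&& ", "")

results = {
    "json.loads": try_parse(json.loads, sample),
    "ast.literal_eval": try_parse(ast.literal_eval, sample),
    "ast.literal_eval, prefix removed": try_parse(ast.literal_eval, without_prefix),
}

for name, outcome in results.items():
    print(name, "->", outcome)
```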
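Good Samaritans asked for an example of the data. The data from the request above is rather long, but I removed some of the 'items' so people can see what the retrieved data looks like in general.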
{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:[{ID:0,N:'AscendasReit',SIP:'',NC:'A17U',R:'',I:'',M:'',LT:0,C:0,VL:97.600,BV:485.300,B:'2.670',S:'2.670',SV:1009.100,O:0,H:0,L:0,V:259811.200,SC:'9',PV:2.660,P:0,P_:'X',V_:''}, {ID:1,N:'CapitaComTrust',SIP:'',NC:'C61U',R:'',I:'',M:'',LT:0,C:0,VL:126.349,BV:1467.300,B:'1.800',S:'1.800',SV:620.900,O:0,H:0,L:0,V:228691.690,SC:'9',PV:1.810,P:0,P_:'X',V_:''}, {ID:2,N:'CapitaLand',SIP:'',NC:'C31',R:'',I:'',M:'',LT:0,C:0,VL:78.000,BV:184.900,B:'3.670',S:'3.670',SV:372.900,O:0,H:0,L:0,V:286026.000,SC:'9',PV:3.660,P:0,P_:'X',V_:''}, {ID:28,N:'Wilmar Intl',SIP:'',NC:'F34',R:'CD',I:'',M:'',LT:0,C:0,VL:0.000,BV:32.000,B:'3.210',S:'3.210',SV:73.100,O:0,H:0,L:0,V:0.000,SC:'2',PV:3.220,P:0,P_:'',V_:''}, {ID:29,N:'YZJ Shipbldg SGD',SIP:'',NC:'BS6',R:'',I:'',M:'',LT:0,C:0,VL:0.000,BV:349.500,B:'1.330',S:'1.330',SV:417.700,O:0,H:0,L:0,V:0.000,SC:'2',PV:1.340,P:0,P_:'',V_:''}]}
Following a recent comment, I know I can do this:
def test2():
    my_text = "{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:[{ID:0,N:'AscendasReit',SIP:'',NC:'A17U',R:'',I:'',M:'',LT:0,C:0,VL:97.600,BV:485.300,B:'2.670',S:'2.670',SV:1009.100,O:0,H:0,L:0,V:259811.200,SC:'9',PV:2.660,P:0,P_:'X',V_:''}, {ID:1,N:'CapitaComTrust',SIP:'',NC:'C61U',R:'',I:'',M:'',LT:0,C:0,VL:126.349,BV:1467.300,B:'1.800',S:'1.800',SV:620.900,O:0,H:0,L:0,V:228691.690,SC:'9',PV:1.810,P:0,P_:'X',V_:''}, {ID:2,N:'CapitaLand',SIP:'',NC:'C31',R:'',I:'',M:'',LT:0,C:0,VL:78.000,BV:184.900,B:'3.670',S:'3.670',SV:372.900,O:0,H:0,L:0,V:286026.000,SC:'9',PV:3.660,P:0,P_:'X',V_:''}, {ID:28,N:'Wilmar Intl',SIP:'',NC:'F34',R:'CD',I:'',M:'',LT:0,C:0,VL:0.000,BV:32.000,B:'3.210',S:'3.210',SV:73.100,O:0,H:0,L:0,V:0.000,SC:'2',PV:3.220,P:0,P_:'',V_:''}, {ID:29,N:'YZJ Shipbldg SGD',SIP:'',NC:'BS6',R:'',I:'',M:'',LT:0,C:0,VL:0.000,BV:349.500,B:'1.330',S:'1.330',SV:417.700,O:0,H:0,L:0,V:0.000,SC:'2',PV:1.340,P:0,P_:'',V_:''}]}"
    # Keep only the contents of the items list and drop the closing "]}"
    prelim = my_text.split("items:[")[1].replace("}]}", "}")
    # Each item is separated from the next by ", "
    temp_list = prelim.split(", ")
    end_list = []
    main_dict = {}
    for tok1 in temp_list:
        temp_dict = {}
        # Strip the braces, then split the item into key:value pairs
        temp = tok1.replace("{","").replace("}","").split(",")
        for tok2 in temp:
            my_key = tok2.split(":")[0]
            my_value = tok2.split(":")[1].replace("'","")
            temp_dict[my_key] = my_value
        end_list.append(temp_dict)
    main_dict['items'] = end_list
    for stock in main_dict['items']:
        print(stock['N'])

test2()
This gives the desired result. I just want to ask whether there is an easier (more elegant / Pythonic) way to do this.
Answer 0 (score: 0)
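You first need to convert the string into text that can be parsed as JSON, and then use json.loads to get the dictionary. prelim is not valid JSON: the keys are unquoted and the string values are not enclosed in double quotes ("). The text also starts with '{}&& ', which has to be removed. Once both are fixed, call json.loads on the cleaned-up text to get the dictionary representation, i.e.: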
import functools as fn  # provides fn.reduce, used further down
import json
import requests
# (old, new) replacement pairs that add double quotes around each key.
# Order matters: a key that ends with another key comes first ('NC:' before 'C:').
reps = (('identifier:', '"identifier":'),
        ('label:', '"label":'),
        ('items:', '"items":'),
        ('NC:', '"NC":'),
        ('ID:', '"ID":'),
        ('N:', '"N":'),
        ('SIP:', '"SIP":'),
        ('SC:', '"SC":'),
        ('R:', '"R":'),
        ('I:', '"I":'),
        ('M:', '"M":'),
        ('LT:', '"LT":'),
        ('C:', '"C":'),
        ('VL:', '"VL":'),
        ('BV:', '"BV":'),
        ('BL:', '"BL":'),
        ('B:', '"B":'),
        ('S:', '"S":'),
        ('SV:', '"SV":'),
        ('O:', '"O":'),
        ('H:', '"H":'),
        ('L:', '"L":'),
        ('PV:', '"PV":'),
        ('V:', '"V":'),
        ('P_:', '"P_":'),
        ('P:', '"P":'),
        ('V_:', '"V_":'))
# prelim is the decoded response text from the question;
# strip the invalid '{}&& ' prefix
prelim = prelim.replace('{}&& ', '')
# replace single quotes with double quotes
prelim = prelim.replace("'", "\"")
print(prelim)
# apply every replacement in reps, one after another
dict_text = fn.reduce(lambda a, kv: a.replace(*kv), reps, prelim)
dic = json.loads(dict_text)
print(dic)
To get the items:
for x in dic['items']:
    print(x['N'])
Output:
2ndChance W200123
3Cnergy
3Cnergy W200528
800 Super
8Telecom^
A-Smart
A-Sonic Aero^
AA
....
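As a possible variation on the approach above, the keys can be quoted with a single regular expression instead of listing every key by hand. This is a sketch, not a general parser for this feed: the function name `parse_loose_json` and the shortened `sample` are made up for illustration, and it assumes that no string value contains a quote character or something that looks like `, key:`.

```python
import json
import re


def parse_loose_json(text):
    """Parse the JSON-like response text into a dict (illustrative sketch)."""
    # Drop everything up to and including the '&&' that follows the leading '{}'.
    body = text.split("&&", 1)[-1].strip()
    # Quote bare keys: an identifier right after '{' or ',' and before ':'.
    body = re.sub(r"([{,]\s*)([A-Za-z_]\w*)\s*:", r'\1"\2":', body)
    # Turn single-quoted strings into double-quoted ones.
    body = body.replace("'", '"')
    return json.loads(body)


# Shortened, hypothetical sample in the same shape as the data in the question.
sample = (
    "{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:["
    "{ID:0,N:'AscendasReit',NC:'A17U',VL:97.600,P_:'X',V_:''}, "
    "{ID:1,N:'CapitaComTrust',NC:'C61U',VL:126.349,P_:'X',V_:''}]}"
)

data = parse_loose_json(sample)
names = [item["N"] for item in data["items"]]
print(names)
```

Because the pattern only matches an identifier that directly follows `{` or `,`, the `8:38` inside the label is left alone. The trade-off is that a value containing an apostrophe or a `, word:` sequence would still break the conversion, so the result is worth checking against real responses.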