Converting a string representation of a json / dict into a usable one with python requests

Date: 2018-03-19 00:35:31

Tags: python json dictionary python-requests

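I finally have to give up and ask for help. I am retrieving a document (with requests) that is in JSON format, but not well formed (no double quotes), and I am trying to extract the data as a normal dictionary. This is what I have; it prints the output I am trying to extract data from: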

import requests

def test():
    url = "http://www.sgx.com/JsonRead/JsonstData"
    payload = {}
    payload['qryId'] = 'RSTIc'
    payload['timeout'] = 60
    header = {'User-Agent' : 'Mozilla/5.0 (compatible; MSIE 10.0; Linux i686; Trident/2.0)', 'Content-Type': 'text/html; charset=utf-8'}
    req = requests.get(url, headers = header, params = payload)
    print(req.url)
    prelim = req.content.decode('utf-8')
    print(type(prelim))
    print(prelim)

test()

What I would like to do after that is (assuming a functioning dictionary):

for stock in prelim['items']:
    print(stock['N'])

which should give me a list of all the stock names.

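I have tried most of the json functions: prelim.json(), load, loads, dump, dumps, parse. None of them seems to work, because the data is not properly formatted. I also tried ast.literal_eval(), without success. I tried some examples from Stack Overflow for converting that string into a proper dictionary, but had no luck. I cannot seem to convert that string so that it behaves like a proper dictionary. If you can point me in the right direction, it would be much appreciated.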
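To make the failure concrete, here is a minimal sketch that feeds a shortened record to both parsers. The `sample` string and the `try_parse` helper are made up for illustration; the real response is much longer and also starts with the `{}&& ` prefix.

```python
import ast
import json

# Shortened, hypothetical record in the same style as the response body:
# bare (unquoted) keys and single-quoted string values.
sample = "{ID:0,N:'AscendasReit',PV:2.660}"

def try_parse(parser, text):
    """Return the parsed value, or the name of the exception that was raised."""
    try:
        return parser(text)
    except (ValueError, SyntaxError) as exc:
        return type(exc).__name__

# json.loads insists on double-quoted property names.
json_result = try_parse(json.loads, sample)

# ast.literal_eval sees ID, N and PV as variable names, not as literals.
ast_result = try_parse(ast.literal_eval, sample)

print(json_result)
print(ast_result)
```

Both calls fail for the same underlying reason: the keys are bare identifiers, which neither JSON nor Python literal syntax accepts.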

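Good samaritans asked for an example of the data. The data from the request above is rather long, so I removed some of the 'items' so that people can see the general look of the retrieved data.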

  

{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:[{ID:0,N:'AscendasReit',SIP:'',NC:'A17U',R:'',I:'',M:'',LT:0,C:0,VL:97.600,BV:485.300,B:'2.670',S:'2.670',SV:1009.100,O:0,H:0,L:0,V:259811.200,SC:'9',PV:2.660,P:0,P_:'X',V_:''},
{ID:1,N:'CapitaComTrust',SIP:'',NC:'C61U',R:'',I:'',M:'',LT:0,C:0,VL:126.349,BV:1467.300,B:'1.800',S:'1.800',SV:620.900,O:0,H:0,L:0,V:228691.690,SC:'9',PV:1.810,P:0,P_:'X',V_:''},
{ID:2,N:'CapitaLand',SIP:'',NC:'C31',R:'',I:'',M:'',LT:0,C:0,VL:78.000,BV:184.900,B:'3.670',S:'3.670',SV:372.900,O:0,H:0,L:0,V:286026.000,SC:'9',PV:3.660,P:0,P_:'X',V_:''},
{ID:28,N:'Wilmar Intl',SIP:'',NC:'F34',R:'CD',I:'',M:'',LT:0,C:0,VL:0.000,BV:32.000,B:'3.210',S:'3.210',SV:73.100,O:0,H:0,L:0,V:0.000,SC:'2',PV:3.220,P:0,P_:'',V_:''},
{ID:29,N:'YZJ Shipbldg SGD',SIP:'',NC:'BS6',R:'',I:'',M:'',LT:0,C:0,VL:0.000,BV:349.500,B:'1.330',S:'1.330',SV:417.700,O:0,H:0,L:0,V:0.000,SC:'2',PV:1.340,P:0,P_:'',V_:''}]}

Based on the recent comments, I know I can do this:

def test2():
    my_text = "{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',items:[{ID:0,N:'AscendasReit',SIP:'',NC:'A17U',R:'',I:'',M:'',LT:0,C:0,VL:97.600,BV:485.300,B:'2.670',S:'2.670',SV:1009.100,O:0,H:0,L:0,V:259811.200,SC:'9',PV:2.660,P:0,P_:'X',V_:''}, {ID:1,N:'CapitaComTrust',SIP:'',NC:'C61U',R:'',I:'',M:'',LT:0,C:0,VL:126.349,BV:1467.300,B:'1.800',S:'1.800',SV:620.900,O:0,H:0,L:0,V:228691.690,SC:'9',PV:1.810,P:0,P_:'X',V_:''}, {ID:2,N:'CapitaLand',SIP:'',NC:'C31',R:'',I:'',M:'',LT:0,C:0,VL:78.000,BV:184.900,B:'3.670',S:'3.670',SV:372.900,O:0,H:0,L:0,V:286026.000,SC:'9',PV:3.660,P:0,P_:'X',V_:''}, {ID:28,N:'Wilmar Intl',SIP:'',NC:'F34',R:'CD',I:'',M:'',LT:0,C:0,VL:0.000,BV:32.000,B:'3.210',S:'3.210',SV:73.100,O:0,H:0,L:0,V:0.000,SC:'2',PV:3.220,P:0,P_:'',V_:''}, {ID:29,N:'YZJ Shipbldg SGD',SIP:'',NC:'BS6',R:'',I:'',M:'',LT:0,C:0,VL:0.000,BV:349.500,B:'1.330',S:'1.330',SV:417.700,O:0,H:0,L:0,V:0.000,SC:'2',PV:1.340,P:0,P_:'',V_:''}]}"
    prelim = my_text.split("items:[")[1].replace("}]}", "}")
    temp_list = prelim.split(", ")
    end_list = []
    main_dict = {}
    for tok1 in temp_list:
        temp_dict = {}
        temp = tok1.replace("{","").replace("}","").split(",")
        for tok2 in temp:            
            my_key = tok2.split(":")[0]
            my_value = tok2.split(":")[1].replace("'","")
            temp_dict[my_key] = my_value
        end_list.append(temp_dict)    
    main_dict['items'] = end_list
    for stock in main_dict['items']:
        print(stock['N'])

test2()

This gives the desired result. I would just like to ask whether there is an easier (more elegant / pythonic) way of doing this.

1 Answer:

Answer 0 (score: 0)

You first need to convert the string into text that JSON can parse, and then use json.loads to get the dictionary.

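prelim is not in JSON format: the property names are not surrounded by double quotes ("), and the string values use single quotes.

  • Remove '{}&& '
  • Surround the properties with "
  • Apply json.loads(new_text) to get the dictionary representation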

import functools as fn  # provides reduce(), used below
import json, requests
# (old, new) replacement pairs, applied in order
reps = (('identifier:', '"identifier":'),
        ('label:', '"label":'),
        ('items:', '"items":'),
        ('NC:', '"NC":'),
        ('ID:', '"ID":'),
        ('N:', '"N":'),
        ('SIP:', '"SIP":'),
        ('SC:', '"SC":'),
        ('R:', '"R":'),
        ('I:', '"I":'),
        ('M:', '"M":'),
        ('LT:', '"LT":'),
        ('C:', '"C":'),
        ('VL:', '"VL":'),
        ('BV:', '"BV":'),
        ('BL:', '"BL":'),
        ('B:', '"B":'),
        ('S:', '"S":'),
        ('SV:', '"SV":'),
        ('O:', '"O":'),
        ('H:', '"H":'),
        ('L:', '"L":'),
        ('PV:', '"PV":'),
        ('V:', '"V":'),
        ('P_:', '"P_":'),
        ('P:', '"P":'),
        ('V_:', '"V_":'))

# prelim is the decoded response text from test() in the question
#getting rid of invalid json text
prelim = prelim.replace('{}&& ', '')

#replacing single quotes with double quotes
prelim = prelim.replace("'", "\"")

print(prelim)
#reduce to get all replacements
dict_text = fn.reduce(lambda a, kv: a.replace(*kv), reps, prelim)
dic = json.loads(dict_text)
print(dic)

Getting the items:

for x in dic['items']:
    print(x['N'])

Output:

2ndChance W200123
3Cnergy
3Cnergy W200528
800 Super
8Telecom^
A-Smart
A-Sonic Aero^
AA
....
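As a possible variation on the approach above, the keys could be quoted with a single regular expression instead of one replacement pair per key. This is only a sketch under assumptions: `loose_to_dict` is a hypothetical helper, the sample is a shortened version of the data in the question, and it assumes that no string value contains an apostrophe or a comma or brace followed by a word and a colon.

```python
import json
import re

def loose_to_dict(text):
    """Turn the loosely formatted response text into a dictionary."""
    # Step 1: drop the leading '{}&&' prefix, if present.
    if text.startswith("{}&&"):
        text = text[len("{}&&"):].lstrip()
    # Step 2: JSON strings need double quotes, not single quotes.
    text = text.replace("'", '"')
    # Step 3: quote every bare key, i.e. an identifier that directly
    # follows '{' or ',' and is itself followed by ':'.
    text = re.sub(r'([{,]\s*)([A-Za-z_]\w*)\s*:', r'\1"\2":', text)
    # Step 4: the text is now valid JSON.
    return json.loads(text)

# Shortened sample based on the data shown in the question.
sample = ("{}&& {identifier:'ID', label:'As at 19-03-2018 8:38 AM',"
          "items:[{ID:0,N:'AscendasReit',NC:'A17U',PV:2.660,P_:'X',V_:''}, "
          "{ID:1,N:'CapitaComTrust',NC:'C61U',PV:1.810,P_:'X',V_:''}]}")

data = loose_to_dict(sample)
names = [stock['N'] for stock in data['items']]
print(names)
```

The trade-off is that the regular expression does not need to know the key names in advance, but it is more likely to misfire if a stock name ever contains text that looks like a key.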