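I am trying to parse an HTTP response as JSON, but it fails with a character error. When I iterate over the response with a for loop instead, it splits everything into single characters. Is there a better way to parse this response?
Code: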
_url = self.MAIN_URL
try:
    _request = self.__webSession.get(_url, cookies=self.__cookies)
    if _request.status_code != 200:
        self.log("Request failed with code: {}. URL: {}".format(_request.status_code, _url))
        return
except Exception as err:
    self.log("[e4] Web-request error: {}. URL: {}".format(err, _url))
    return
_text = _request.json()
json.loads() raises the following:
Expecting value: line 1 column 110 (char 109)
The HTTP response that needs to be parsed:
[
    [
        9266939,
        'Value1',
        'Value2',
        'Value3',
        ,
        'Value4',
        [
            [
                'number',
                'number2',
                [
                    'value',
                    ,
                    'value2'
                ]
            ]
        ]
    ],
    [
        5987798,
        'Value1',
        'Value2',
        ,
        'Value3',
        'Value4',
        [
            [
                'number',
                'number2',
                [
                    'value',
                    'value2'
                ]
            ]
        ]
    ]
]
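Answer 0 (score: 0)
Although the error message is confusing because of its line and column numbers, the JSON format never accepts single quotes around strings, so the given HTTP response is not valid JSON. Strings must be delimited with double quotes.
So you would have to change the input like this (if you control it):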
[
    [
        9266939,
        "Value1",
        "Value2",
        "Value3",
        "Value4",
        [
            [
                "number",
                "number2",
                [
                    "value",
                    "value2"
                ]
            ]
        ...
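As a quick check of the quoting rule, the following minimal sketch feeds the same short list to json.loads() twice, once with single quotes and once with double quotes. The helper name try_parse and both sample strings are made up for illustration; they are not part of the original response.

```python
import json

# Hypothetical sample payloads: same data, different string delimiters.
single_quoted = "[9266939, 'Value1', 'Value2']"
double_quoted = '[9266939, "Value1", "Value2"]'


def try_parse(text):
    """Return the parsed value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


rejected = try_parse(single_quoted)  # single quotes are not valid JSON
accepted = try_parse(double_quoted)  # double quotes parse normally
print(rejected, accepted)
```

Only the double-quoted variant parses; the single-quoted one raises json.JSONDecodeError, which the helper turns into None.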
If you cannot control the HTTP response you are parsing, you can replace all single quotes with double quotes before parsing:
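import json

# Raw response body; assumes a requests-style response object
# such as _request from the question.
http_response_string = _request.text
adjusted_http_response_string = http_response_string.replace("'", '"')
data = json.loads(adjusted_http_response_string)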
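This of course carries the risk of also replacing single quotes (or apostrophes) that are not string delimiters. Still, it solves the problem well enough and works most of the time.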
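To make that risk concrete, here is a small sketch of the blanket replacement applied to two made-up payloads, one of which contains an apostrophe inside a string. The function name naive_parse and both payloads are hypothetical.

```python
import json


def naive_parse(text):
    """Swap every single quote for a double quote, then parse.

    Returns the parsed value, or None if the result is not valid JSON.
    """
    try:
        return json.loads(text.replace("'", '"'))
    except json.JSONDecodeError:
        return None


# Hypothetical payloads for illustration only.
safe = "['Value1', 'Value2']"
risky = "['it's here', 'Value2']"  # apostrophe inside a string value

safe_result = naive_parse(safe)    # parses as expected
risky_result = naive_parse(risky)  # the apostrophe becomes a stray double quote
print(safe_result, risky_result)
```

The second payload fails because the apostrophe is turned into a double quote that ends the string early.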
Edit:
Further cleanup, as requested in the comments:
import json
import re

# Raw response body; assumes a requests-style response object
# such as _request from the question.
http_response_string = _request.text
# More advanced replacement of ' with ", expecting
# strings to always come after at least four spaces,
# and always end in either comma, colon, or newline.
adjusted_http_response_string = re.sub(
    r"(    )'", r'\1"',
    re.sub(r"'([,:\n])", r'"\1', http_response_string))
# Replace faulty ", ," sequences with a single ",".
adjusted_http_response_string = re.sub(
    r",(\s*,)*", ",", adjusted_http_response_string)
data = json.loads(adjusted_http_response_string)
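The same cleanup can be tried without a live request. The sketch below wraps the substitutions in a function and runs them on a shortened, made-up excerpt shaped like the response in the question. The name loosely_parse and the sample text are hypothetical, and the sketch assumes every string is indented by at least four spaces, as the comments above state.

```python
import json
import re


def loosely_parse(text):
    """Parse JSON-like text that uses single quotes and has empty elements."""
    # Closing quotes: a single quote followed by a comma, colon, or newline.
    text = re.sub(r"'([,:\n])", r'"\1', text)
    # Opening quotes: a single quote preceded by four spaces of indentation.
    text = re.sub(r"(    )'", r'\1"', text)
    # Empty elements: collapse a comma followed by further commas into one.
    text = re.sub(r",(\s*,)+", ",", text)
    return json.loads(text)


# Hypothetical excerpt shaped like the response in the question.
sample = """[
    [
        9266939,
        'Value1',
        ,
        'Value4',
        [
            'value',
            ,
            'value2'
        ]
    ]
]"""

parsed = loosely_parse(sample)
print(parsed)
```

Note that collapsing the commas drops the empty elements entirely, so later values shift one position to the left. If positions matter, substituting null for each empty slot would be an alternative.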