Find and replace in JSON data with Python

Date: 2019-05-03 00:10:29

Tags: sql python-3.x

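The API response contains an apostrophe ('), and that apostrophe cuts off the SQL code that follows it. How can I find and replace the character before sending the JSON object to the SQL DB?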

{
"num_results": 455161,
"results": [
    {
        "activity_date": "1975-12-01",
        "activity_id": "50",
        "activity_name": "ORDERED",
        "activity_remark": "FOR DELIVERY 1976-04.",
        "operator_country_lar": "France",
        "operator_country_lar_id": "865",
        "operator_id": "2786"
    },
    {
        "activity_date": "1974-10-01",
        "activity_id": "50",
        "activity_name": "ORDERED",
        "activity_remark": "FOR DELIVERY 1976-04.",
        "operator_country_lar": "Korea, Democratic People's Republic of",
        "operator_country_lar_id": "206",
        "operator_id": "29080"
    }
],
"results_this_page": 2,
"status": 200}

I tried converting the JSON to a str, calling .replace("'", ""), and then converting back to JSON, but the data does not come back as JSON.

convert_str = str(self.response.json())

convert_str = convert_str.replace("'","")

print(json.dumps(convert_str, sort_keys=True, indent=4))    

2 answers:

Answer 0 (score: 1)

This successfully removes the unwanted apostrophe.

>>> import json
>>> import pprint
>>> d = {
     'num_results': 455161,
     'results': [{'activity_date': '1975-12-01',
                  'activity_id': '50',
                  'activity_name': 'ORDERED',
                  'activity_remark': 'FOR DELIVERY 1976-04.',
                  'operator_country_lar': 'France',
                  'operator_country_lar_id': '865',
                  'operator_id': '2786'},
                 {'activity_date': '1974-10-01',
                  'activity_id': '50',
                  'activity_name': 'ORDERED',
                  'activity_remark': 'FOR DELIVERY 1976-04.',
                  'operator_country_lar': "Korea, Democratic People's Republic of",
                  'operator_country_lar_id': '206',
                  'operator_id': '29080'}],
     'results_this_page': 2,
     'status': 200}
>>> 
>>> pprint.pprint(json.loads(json.dumps(d).replace("'", "")))
{'num_results': 455161,
 'results': [{'activity_date': '1975-12-01',
              'activity_id': '50',
              'activity_name': 'ORDERED',
              'activity_remark': 'FOR DELIVERY 1976-04.',
              'operator_country_lar': 'France',
              'operator_country_lar_id': '865',
              'operator_id': '2786'},
             {'activity_date': '1974-10-01',
              'activity_id': '50',
              'activity_name': 'ORDERED',
              'activity_remark': 'FOR DELIVERY 1976-04.',
              'operator_country_lar': 'Korea, Democratic Peoples Republic of',
              'operator_country_lar_id': '206',
              'operator_id': '29080'}],
 'results_this_page': 2,
 'status': 200}

For operator_country_lar, you can either use double quotes, "People's", or escape the apostrophe with a backslash, 'People\'s'.

Rather than munging the entire JSON string, you may find it helpful to access each dict key, val item and modify the individual val strings. For example:

for result in d['results']:
    for k, v in result.items():
        result[k] = v.replace("'", "")
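
The loop above assumes every value in each result is a string. A minimal sketch of a more general variant, assuming the response may also contain nested lists, dicts, and non-string values; the helper name strip_apostrophes is hypothetical and not part of the original answer:

```python
def strip_apostrophes(value):
    """Return a copy of value with apostrophes removed from every string."""
    if isinstance(value, str):
        return value.replace("'", "")
    if isinstance(value, dict):
        return {k: strip_apostrophes(v) for k, v in value.items()}
    if isinstance(value, list):
        return [strip_apostrophes(v) for v in value]
    return value  # numbers, booleans and None pass through unchanged


# Trimmed-down sample in the shape of the API response from the question.
d = {
    "num_results": 455161,
    "results": [
        {"operator_country_lar": "Korea, Democratic People's Republic of",
         "operator_id": "29080"},
    ],
    "status": 200,
}

cleaned = strip_apostrophes(d)
print(cleaned["results"][0]["operator_country_lar"])
```

Because the helper builds a new structure instead of modifying d in place, the original response stays available if the unmodified text is needed later.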
  

"The API response contains an apostrophe ('), and that apostrophe cuts off the SQL code that follows it."

It sounds like you have managed to launch a SQL injection attack on yourself. Recall the lesson of Bobby Tables.

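It is important to use the appropriate database API for the appropriate purpose. Rather than interpolating quoted strings into a WHERE clause, it is better to pass them as separate bind parameters, so the quoting problem never arises.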
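
As a rough illustration of bind parameters, here is a minimal sketch using the standard-library sqlite3 module with an in-memory database. The table and column names are assumptions made up for this example, and the question does not say which database driver is in use. The placeholder syntax depends on the driver: sqlite3 uses ?, while psycopg2, for instance, uses %s.

```python
import sqlite3

# Hypothetical table and columns, created in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (operator_id TEXT, operator_country_lar TEXT)"
)

row = {
    "operator_id": "29080",
    "operator_country_lar": "Korea, Democratic People's Republic of",
}

# The ? placeholders are filled in by the driver, so the apostrophe is
# handled as data and never becomes part of the SQL text.
conn.execute(
    "INSERT INTO activity (operator_id, operator_country_lar) VALUES (?, ?)",
    (row["operator_id"], row["operator_country_lar"]),
)

# The same applies to values used in a WHERE clause.
stored = conn.execute(
    "SELECT operator_country_lar FROM activity WHERE operator_id = ?",
    (row["operator_id"],),
).fetchone()[0]
print(stored)
conn.close()
```

With this approach the apostrophe is stored intact, so there is no need to strip it from the data at all.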

Answer 1 (score: 1)

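I ran into a similar problem when storing large JSON files as binary large objects in a PostgreSQL database. I found that the ast.literal_eval solution works well for serializing and deserializing potentially volatile text: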

import json
from ast import literal_eval

s = ('''[
       {
        "activity_date": "1975-12-01",
        "activity_id": "50",
        "activity_name": "ORDERED",
        "activity_remark": "FOR DELIVERY 1976-04.",
        "operator_country_lar": "France",
        "operator_country_lar_id": "865",
        "operator_id": "2786"
       },
       {
        "activity_date": "1974-10-01",
        "activity_id": "50",
        "activity_name": "ORDERED",
        "activity_remark": "FOR DELIVERY 1976-04.",
        "operator_country_lar": "Korea, Democratic People's Republic of",
        "operator_country_lar_id": "206",
        "operator_id": "29080"
       }
     ]''')

s = literal_eval(s)
d = json.dumps(s)
l = json.loads(d)

print(s)
print("")
print(d)
print("")
print(l)

"""

[{'activity_date': '1975-12-01', 'activity_id': '50', 'activity_name': 'ORDERED', 'activity_remark': 'FOR DELIVERY 1976-04.', 'operator_country_lar': 'France', 'operator_country_lar_id': '865', 'operator_id': '2786'}, {'activity_date': '1974-10-01', 'activity_id': '50', 'activity_name': 'ORDERED', 'activity_remark': 'FOR DELIVERY 1976-04.', 'operator_country_lar': "Korea, Democratic People's Republic of", 'operator_country_lar_id': '206', 'operator_id': '29080'}]

[{"activity_date": "1975-12-01", "activity_id": "50", "activity_name": "ORDERED", "activity_remark": "FOR DELIVERY 1976-04.", "operator_country_lar": "France", "operator_country_lar_id": "865", "operator_id": "2786"}, {"activity_date": "1974-10-01", "activity_id": "50", "activity_name": "ORDERED", "activity_remark": "FOR DELIVERY 1976-04.", "operator_country_lar": "Korea, Democratic People's Republic of", "operator_country_lar_id": "206", "operator_id": "29080"}]

[{'activity_date': '1975-12-01', 'activity_id': '50', 'activity_name': 'ORDERED', 'activity_remark': 'FOR DELIVERY 1976-04.', 'operator_country_lar': 'France', 'operator_country_lar_id': '865', 'operator_id': '2786'}, {'activity_date': '1974-10-01', 'activity_id': '50', 'activity_name': 'ORDERED', 'activity_remark': 'FOR DELIVERY 1976-04.', 'operator_country_lar': "Korea, Democratic People's Republic of", 'operator_country_lar_id': '206', 'operator_id': '29080'}]

"""
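
One caveat worth noting: literal_eval parses Python literals rather than JSON, so it works on the sample above only because that sample contains nothing but strings. A minimal sketch, assuming the incoming text is valid JSON, showing that json.loads handles the apostrophe directly and also accepts true and null, which literal_eval rejects. The "active" and "note" fields are invented for this illustration:

```python
import json
from ast import literal_eval

# Sample JSON text; "active" and "note" are hypothetical fields added to
# show JSON-only literals alongside the apostrophe.
text = '''{"operator_country_lar": "Korea, Democratic People's Republic of",
           "active": true, "note": null}'''

# json.loads parses the text directly; the apostrophe needs no special care.
data = json.loads(text)
print(data["operator_country_lar"])

# literal_eval raises ValueError because true and null are not Python literals.
try:
    literal_eval(text)
    literal_eval_ok = True
except ValueError:
    literal_eval_ok = False
print(literal_eval_ok)

# Serializing and parsing again returns an equal object.
round_tripped = json.loads(json.dumps(data))
```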