A JSON script is passed as a string, and I need to extract the numeric value that follows AdminSerializer for further mapping. Sample data is below:
class AdminSerializer(serializers.ModelSerializer):
    user = AdminUserSerializer()  # make sure user_type is read-only in whatever serializer you specify here

    class Meta:
        model = models.Admin
        fields = ('user', 'first_name', 'last_name', 'dob', 'gender')

    def create(self, validated_data):
        user_data = validated_data.pop('user')
        user = models.User.objects.create(**user_data, user_type=constants.Constants.ADMIN)
        admin = models.Admin.objects.create(user=user, **validated_data)
        return admin
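The parameters are dynamic, so I cannot extract the value with the substr function, nor by counting occurrences of a particular special character and extracting what follows.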
Answer 0: (score: 0)
The JSON is malformed: it contains an extra ] and a trailing tail after the closing }. For well-formed JSON you can use get_json_object, for example:
select get_json_object(src_json,'$.url.content_id') from
(
select '{"url": {"phone": "videos/hssportint/hssport/jocaasd/6_3818e20a9e/19098311205/phone", "tv": "/mnt/c81292786e1e368e12144c302007/output/", "sample_aspect_ratio": "1:1", "subsample": 25, "content_id": "1000231205", "encryption_enabled": false, "non_ad_time_intervals": [2330.68, 2898.36], "packager_path": "/opt/bento4"}}' as src_json
)s
;
Result:
OK
1000231205
Time taken: 21.606 seconds, Fetched: 1 row(s)
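If some rows may still carry the malformed variant, one possible approach is to try get_json_object first and fall back to a regular expression when it returns NULL. This is only a sketch: as far as I know get_json_object returns NULL for input it cannot parse, and the shortened sample string and the alias names below are made up for illustration.

```sql
-- Sketch only: the inline row is a shortened, deliberately malformed sample
-- (note the extra "]" after the array); src_json and s are illustrative names.
select coalesce(
         get_json_object(src_json, '$.url.content_id'),            -- used when the JSON parses
         regexp_extract(src_json, '"content_id":\\s*"(\\d+)"', 1)  -- fallback when parsing yields NULL
       ) as content_id
from
(
select '{"url": {"content_id": "1000231205", "non_ad_time_intervals": [2330.68, 2898.36]], "packager_path": "/opt/bento4"}}' as src_json
)s
;
```

The ordering inside coalesce matters: the JSON path lookup is tried first, so rows that do parse are read structurally and only the broken ones depend on the regular expression.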
Answer 1: (score: 0)
You can use the regexp_extract function in Hive with a matching regular expression to extract only the digits from content_id.
Example:
select regexp_extract(col1,'"content_id":\\s"(\\d+)"',1) from (
select string('{"url": {"phone": "videos/hssportint/hssport/jocaasd/6_3818e20a9e/19098311205/phone", "tv": "/mnt/c81292786e1e368e12144c302007/output/", "sample_aspect_ratio": "1:1", "subsample": 25, "content_id": "1000231205", "encryption_enabled": false, "non_ad_time_intervals": [2330.68, 2898.36]], "packager_path": "/opt/bento4"}}], "vmaf_path": "/vmaf"}')col1
)t;
+-------------+--+
| _c0 |
+-------------+--+
| 1000231205 |
+-------------+--+
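The pattern in the query above expects exactly one whitespace character after the colon. If the spacing can vary, a slightly more tolerant variant uses \\s* (zero or more whitespace characters) and casts the captured digits to a number for the later mapping. This is a sketch under those assumptions; the bigint cast, the alias content_id_num and the shortened sample string are my additions, not part of the original data.

```sql
-- Sketch only: tolerate zero or more whitespace characters after the colon
-- and convert the captured digits to a numeric type.
select cast(regexp_extract(col1, '"content_id":\\s*"(\\d+)"', 1) as bigint) as content_id_num
from (
select '{"url": {"content_id":"1000231205", "encryption_enabled": false}}' as col1
)t;
```

Because regexp_extract works on the raw string, this variant does not depend on the JSON being well formed.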
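Regex explanation:
"content_id":\\s"(\\d+)" //matches the literal "content_id": followed by exactly one whitespace character, then one or more digits inside double quotes (the digits are capture group 1)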
Answer 2: (score: 0)
I found a costly way to do it by combining a regular expression with substring functions:
substr(split(regexp_extract(message,'content_id([^&]*)'), '"')[3],1) as content_id