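The ultimate goal, and the root of the problem, is to have a field in Redshift that is compatible with json_extract_path_text.
This is what the data looks like now: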
{'error': "Feed load failed: Parameter 'url' must be a string, not object", 'errorCode': 3, 'event_origin': 'app', 'screen_id': '6118964227874465', 'screen_class': 'Promotion'}
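To extract the fields I need from the string in Redshift, I replace the single quotes with double quotes. This particular record raises an error because the value of error contains single quotes of its own. If those are replaced as well, the result is no longer valid JSON.
So what I need is: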
{"error": "Feed load failed: Parameter 'url' must be a string, not object", "errorCode": 3, "event_origin": "app", "screen_id": "6118964227874465", "screen_class": "Promotion"}
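The failure described above can be reproduced in a few lines of Python. This is only a sketch: the record is shortened to two fields, and `is_valid_json` is a hypothetical helper name introduced for illustration.

```python
import json

# Shortened version of the record from the question (Python repr, not JSON).
record = "{'error': \"Feed load failed: Parameter 'url' must be a string, not object\", 'errorCode': 3}"

# Naive approach: replace every single quote, including the ones around url.
naive = record.replace("'", '"')


def is_valid_json(text):
    """Return True if text parses as JSON (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(naive)
print(is_valid_json(naive))
```

The inner quotes around url become double quotes too, which ends the string value early and makes the document unparseable.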
Answer 0 (score: 0)
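There are several ways to do this. One is to use the third-party regex module with the pattern
"[^"]*"(*SKIP)(*FAIL)|'
The first alternative matches a complete double-quoted substring and then discards it, so only single quotes outside double-quoted substrings are left to match. In Python: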
import regex as re
rx = re.compile(r'"[^"]*"(*SKIP)(*FAIL)|\'')
new_string = rx.sub('"', old_string)
With the standard re module you would need to pass a function as the replacement and check whether the group matched; (*SKIP)(*FAIL) lets you avoid that.
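A minimal sketch of the standard-library variant mentioned above, assuming the same matching idea: double-quoted substrings are captured in a group and returned unchanged, and any other match is a single quote to be replaced. The names `single_to_double` and `_repl` are hypothetical, chosen for illustration.

```python
import json
import re

# Group 1 captures a whole double-quoted substring; the second alternative
# matches a single quote that lies outside any such substring.
_QUOTES = re.compile(r'("[^"]*")|\'')


def _repl(match):
    # If group 1 matched, we are looking at a double-quoted substring: keep it.
    if match.group(1) is not None:
        return match.group(1)
    # Otherwise the match is a bare single quote: swap it.
    return '"'


def single_to_double(text):
    return _QUOTES.sub(_repl, text)


old_string = "{'error': \"Feed load failed: Parameter 'url' must be a string, not object\", 'errorCode': 3}"
new_string = single_to_double(old_string)
print(new_string)
print(json.loads(new_string)["errorCode"])
```

Like the pattern it is based on, this does not handle escaped double quotes inside a double-quoted value.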
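Answer 1 (score: 0)
I tried the regex approach but found it complicated and slow. So I wrote a simple "bracket parser" that keeps track of the current quote mode. It cannot handle multiple levels of nesting; you would need a stack for that. For my use case, converting str(dict) to proper JSON, it works:
Example input: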
{'cities': [{'name': "Upper Hell's Gate"}, {'name': "N'zeto"}]}
Example output:
{"cities": [{"name": "Upper Hell's Gate"}, {"name": "N'zeto"}]}
Python unit test
import json

def testSingleToDoubleQuote(self):
    # Start from valid JSON so that str() of the parsed dict gives the
    # single-quoted Python repr that needs converting.
    jsonStr='''
    {
        "cities": [
        {
            "name": "Upper Hell's Gate"
        },
        {
            "name": "N'zeto"
        }
        ]
    }
    '''
    listOfDicts=json.loads(jsonStr)
    dictStr=str(listOfDicts)
    if self.debug:
        print(dictStr)
    jsonStr2=JSONAble.singleQuoteToDoubleQuote(dictStr)
    if self.debug:
        print(jsonStr2)
    self.assertEqual('''{"cities": [{"name": "Upper Hell's Gate"}, {"name": "N'zeto"}]}''',jsonStr2)
singleQuoteToDoubleQuote
def singleQuoteToDoubleQuote(singleQuoted):
    '''
    convert a single quoted string to a double quoted one
    Args:
        singleQuoted(string): a single quoted string e.g. {'cities': [{'name': "Upper Hell's Gate"}]}
    Returns:
        string: the double quoted version of the string e.g.
    see
        - https://stackoverflow.com/questions/55600788/python-replace-single-quotes-with-double-quotes-but-leave-ones-within-double-q
    '''
    cList=list(singleQuoted)
    inDouble=False
    inSingle=False
    for i,c in enumerate(cList):
        #print ("%d:%s %r %r" %(i,c,inSingle,inDouble))
        if c=="'":
            # only swap single quotes that are outside a double-quoted span
            if not inDouble:
                inSingle=not inSingle
                cList[i]='"'
        elif c=='"':
            inDouble=not inDouble
    doubleQuoted="".join(cList)
    return doubleQuoted
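When the input really is the output of str(dict), as in the use case above, a different technique from the ones in these answers is to parse the string as a Python literal and re-serialize it. The following is a sketch under that assumption, using only the standard library; `py_repr_to_json` is a hypothetical name.

```python
import ast
import json


def py_repr_to_json(text):
    """Parse a Python literal (e.g. the result of str(dict)) and dump it as JSON."""
    # ast.literal_eval only evaluates literals (dicts, lists, strings, numbers,
    # True/False/None); it raises ValueError or SyntaxError on anything else.
    value = ast.literal_eval(text)
    return json.dumps(value)


dictStr = "{'cities': [{'name': \"Upper Hell's Gate\"}, {'name': \"N'zeto\"}]}"
print(py_repr_to_json(dictStr))
```

Because the string is actually parsed, this also converts True, False and None to true, false and null, which a quote-swapping approach leaves untouched. It only works where Python is available to preprocess the data, not inside a Redshift query.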