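I am trying to parse the following XML, which has a recursive hierarchy. I can only loop once, so the second survey's data is never populated. I am also getting NULL values in the columns.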
<DATA_EXPORT>
<SURVEYDATA>
<SURVEY_ID>1</SURVEY_ID>
<CLIENT_ID>ABC</CLIENT_ID>
<COMMENTS>
<RESPONSE>
<QUESTION>Do you drink?</QUESTION>
<ANSWER>Yes</ANSWER>
</RESPONSE>
</COMMENTS>
<COMMENTS>
<RESPONSE>
<QUESTION>Do you Smoke?</QUESTION>
<ANSWER>Yes</ANSWER>
</RESPONSE>
</COMMENTS>
</SURVEYDATA>
<SURVEYDATA>
<SURVEY_ID>2</SURVEY_ID>
<CLIENT_ID>DEF</CLIENT_ID>
<COMMENTS>
<RESPONSE>
<QUESTION>Do you drink?</QUESTION>
<ANSWER>No</ANSWER>
</RESPONSE>
</COMMENTS>
</SURVEYDATA>
</DATA_EXPORT>
The query I used:
SELECT
GET(XMLGET(XMLGET(TEST_XML_1, 'SURVEYDATA'),'SURVEY_ID'), '$') AS SURVEY_ID,
GET(XMLGET(D.VALUE, 'QUESTION'), '$') AS QUESTION,
GET(XMLGET(D.VALUE, 'ANSWER'), '$') AS ANSWER
FROM DATA,
LATERAL FLATTEN (GET(XMLGET(TEST_XML_1, 'SURVEYDATA', 0), '$'))D;
The output I am getting:
SURVEY_ID | QUESTION | ANSWER |
---|---|---|
1 | NULL | NULL |
1 | NULL | NULL |
1 | NULL | NULL |
1 | NULL | NULL |
The output I expect:
SURVEY_ID | QUESTION | ANSWER |
---|---|---|
1 | Do you drink? | Yes |
1 | Do you Smoke? | Yes |
2 | Do you drink? | No |
Answer 0 (score: 1)
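So it looks like you want to loop across the objects in DATA_EXPORT. To do that you need to get that object, which GET(xml, '$') will give you, so the query below gives you two rows, one per SURVEYDATA:
SELECT q.*
FROM TEST_XML,
LATERAL FLATTEN(GET(src_xml, '$')) q;
Assuming you want the survey_id and client_id, let's now pull those out, plus the nested comments, so we can see that we are getting the data we want: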
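SELECT
get(XMLGET(q.value, 'SURVEY_ID'), '$') as survey_id
,get(XMLGET(q.value, 'CLIENT_ID'), '$') as client_id
,XMLGET(q.value, 'COMMENTS') as comments
FROM TEST_XML,
LATERAL FLATTEN(GET(src_xml, '$')) q;
But we notice that this returns only one comment per survey, because XMLGET returns just the first matching element by default. So instead of looping across the comments, we need to loop across the child objects of SURVEYDATA and keep only the COMMENTS ones: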
SELECT
get(XMLGET(q.value, 'SURVEY_ID'), '$') as survey_id
,get(XMLGET(q.value, 'CLIENT_ID'), '$') as client_id
,XMLGET(q.value, 'COMMENTS') as comments
,get(q.value, '$')
,c.*
FROM TEST_XML,
LATERAL FLATTEN(GET(src_xml, '$')) q,
LATERAL FLATTEN(get(q.value, '$')) c
WHERE get(c.value, '@')='COMMENTS';
Now we can unpack the values we want from the comments:
SELECT
get(XMLGET(q.value, 'SURVEY_ID'), '$') as survey_id
,get(XMLGET(q.value, 'CLIENT_ID'), '$') as client_id
,c.value
,XMLGET(c.value, 'RESPONSE') as resp
,get(XMLGET(resp, 'QUESTION'), '$') as question
,get(XMLGET(resp, 'ANSWER'), '$' ) as answer
FROM TEST_XML,
LATERAL FLATTEN(GET(src_xml, '$')) q,
LATERAL FLATTEN(get(q.value, '$')) c
WHERE get(c.value, '@')='COMMENTS';
So now we can see that we have all the values we want, and we can compact the SQL a little so that it no longer carries the intermediate values we used to work through the problem.
That gives the final SQL, shown here with the data in a CTE to help testing, which returns one row per COMMENTS element:
with TEST_XML as (
select parse_xml('<DATA_EXPORT>
<SURVEYDATA>
<SURVEY_ID>1</SURVEY_ID>
<CLIENT_ID>ABC</CLIENT_ID>
<COMMENTS>
<RESPONSE>
<QUESTION>Do you drink?</QUESTION>
<ANSWER>Yes</ANSWER>
</RESPONSE>
</COMMENTS>
<COMMENTS>
<RESPONSE>
<QUESTION>Do you Smoke?</QUESTION>
<ANSWER>Yes</ANSWER>
</RESPONSE>
</COMMENTS>
</SURVEYDATA>
<SURVEYDATA>
<SURVEY_ID>2</SURVEY_ID>
<CLIENT_ID>DEF</CLIENT_ID>
<COMMENTS>
<RESPONSE>
<QUESTION>Do you drink?</QUESTION>
<ANSWER>No</ANSWER>
</RESPONSE>
</COMMENTS>
</SURVEYDATA>
</DATA_EXPORT>') as SRC_XML
)
SELECT
get(XMLGET(q.value, 'SURVEY_ID'), '$') as survey_id
,get(XMLGET(q.value, 'CLIENT_ID'), '$') as client_id
,get(XMLGET(XMLGET(c.value, 'RESPONSE'), 'QUESTION'), '$') as question
,get(XMLGET(XMLGET(c.value, 'RESPONSE'), 'ANSWER'), '$' ) as answer
FROM TEST_XML,
LATERAL FLATTEN(GET(src_xml, '$')) q,
LATERAL FLATTEN(get(q.value, '$')) c
WHERE get(c.value, '@')='COMMENTS'
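One thing the final query leaves open is that every column comes back as a VARIANT, so string values may be displayed with surrounding quotes. If plain strings are wanted, a possible variation is to cast each extracted value, as in the sketch below. It assumes the same TEST_XML / SRC_XML names used above; the ::STRING casts and the ORDER BY on the FLATTEN index columns are additions for illustration and not part of the original answer.

```sql
-- Sketch only: same traversal as the final query, with casts added.
-- The ::STRING casts and ORDER BY are assumptions, not part of the original answer.
WITH TEST_XML AS (
    SELECT PARSE_XML('<DATA_EXPORT>
        <SURVEYDATA>
            <SURVEY_ID>1</SURVEY_ID>
            <CLIENT_ID>ABC</CLIENT_ID>
            <COMMENTS><RESPONSE><QUESTION>Do you drink?</QUESTION><ANSWER>Yes</ANSWER></RESPONSE></COMMENTS>
            <COMMENTS><RESPONSE><QUESTION>Do you Smoke?</QUESTION><ANSWER>Yes</ANSWER></RESPONSE></COMMENTS>
        </SURVEYDATA>
        <SURVEYDATA>
            <SURVEY_ID>2</SURVEY_ID>
            <CLIENT_ID>DEF</CLIENT_ID>
            <COMMENTS><RESPONSE><QUESTION>Do you drink?</QUESTION><ANSWER>No</ANSWER></RESPONSE></COMMENTS>
        </SURVEYDATA>
    </DATA_EXPORT>') AS SRC_XML
)
SELECT
    -- GET(..., '$') returns the element content as a VARIANT; cast it to a plain string
    GET(XMLGET(q.value, 'SURVEY_ID'), '$')::STRING AS survey_id,
    GET(XMLGET(q.value, 'CLIENT_ID'), '$')::STRING AS client_id,
    GET(XMLGET(XMLGET(c.value, 'RESPONSE'), 'QUESTION'), '$')::STRING AS question,
    GET(XMLGET(XMLGET(c.value, 'RESPONSE'), 'ANSWER'), '$')::STRING AS answer
FROM TEST_XML,
    LATERAL FLATTEN(GET(src_xml, '$')) q,   -- one row per SURVEYDATA
    LATERAL FLATTEN(GET(q.value, '$')) c    -- one row per child of that SURVEYDATA
WHERE GET(c.value, '@') = 'COMMENTS'        -- keep only the COMMENTS children
ORDER BY q.index, c.index;                  -- keep document order
```

Casting in the SELECT list rather than changing the FLATTEN calls keeps the traversal identical to the answer, so the two queries can be compared side by side.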