(Submitted on behalf of a Snowflake user)
Given the following XML:
<clinical_study>
<!-- This xml conforms to an XML Schema at:
https://clinicaltrials.gov/ct2/html/images/info/public.xsd -->
<required_header>
<download_date>ClinicalTrials.gov processed this data on September 13, 2019</download_date>
<link_text>Link to the current ClinicalTrials.gov record.</link_text>
<url>https://clinicaltrials.gov/show/NCT00010010</url>
</required_header>
<id_info>
<org_study_id>CDR0000068431</org_study_id>
<secondary_id>NYU-0004</secondary_id>
<secondary_id>P-UPJOHN-NYU-0004</secondary_id>
<secondary_id>NCI-G00-1906</seco
I am getting NULL instead of the contents of the root element. I have read "How to Easily Load and Query XML Data with Snowflake Part 2" in the Snowflake documentation, and I am using:
SELECT XMLGET(src_xml, 'clinical_study'):"$",
*
FROM STG_XML
;
...but it returns NULL when I try to get the contents of the root element with the SQL above.
Any ideas, suggestions, and/or workarounds?
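Answer 0 (score: 2)
As Mike Walton noted, the XML is incomplete, which makes it hard for others to reproduce the NULL the OP is asking about. Once the open XML elements are closed, the cause of the NULL is that "clinical_study" is the root node: XMLGET retrieves elements nested inside the node it is given, so it cannot return that node itself. To return the contents of the root node, use the following expression: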
src_xml:"$" AS clinical_study_contents
Here is a simple test harness that demonstrates this, along with a valid use of XMLGET (extracting the "id_info" element):
WITH STG_XML AS (
SELECT PARSE_XML($1) AS src_xml
FROM VALUES
($$
<clinical_study>
<!-- This xml conforms to an XML Schema at:
https://clinicaltrials.gov/ct2/html/images/info/public.xsd -->
<required_header>
<download_date>ClinicalTrials.gov processed this data on September 13, 2019</download_date>
<link_text>Link to the current ClinicalTrials.gov record.</link_text>
<url>https://clinicaltrials.gov/show/NCT00010010</url>
</required_header>
<id_info>
<org_study_id>CDR0000068431</org_study_id>
<secondary_id>NYU-0004</secondary_id>
<secondary_id>P-UPJOHN-NYU-0004</secondary_id>
<secondary_id>NCI-G00-1906</secondary_id>
</id_info>
</clinical_study>
$$)
)
SELECT src_xml:"$" AS clinical_study_contents
,XMLGET(src_xml, 'id_info') as id_info_element
,*
FROM STG_XML
;
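Building on the harness above, here is a minimal sketch of going one level deeper. Nested XMLGET calls reach elements inside "id_info", and the optional third argument of XMLGET (a zero-based instance number) selects among repeated tags such as "secondary_id". The shortened XML and the column aliases are illustrative and not part of the original post.

```sql
WITH STG_XML AS (
    SELECT PARSE_XML($1) AS src_xml
    FROM VALUES
    ($$
<clinical_study>
  <id_info>
    <org_study_id>CDR0000068431</org_study_id>
    <secondary_id>NYU-0004</secondary_id>
    <secondary_id>P-UPJOHN-NYU-0004</secondary_id>
  </id_info>
</clinical_study>
$$)
)
SELECT
    -- "@" holds the tag name of the node, here the root element
    src_xml:"@"::string AS root_element_name,
    -- XMLGET on the root finds its child; a second XMLGET finds the grandchild
    XMLGET(XMLGET(src_xml, 'id_info'), 'org_study_id'):"$"::string AS org_study_id,
    -- the third argument picks the Nth occurrence of a repeated tag (0-based)
    XMLGET(XMLGET(src_xml, 'id_info'), 'secondary_id', 0):"$"::string AS first_secondary_id,
    XMLGET(XMLGET(src_xml, 'id_info'), 'secondary_id', 1):"$"::string AS second_secondary_id
FROM STG_XML;
```

If the number of repeated elements is not known in advance, indexing by instance number does not scale, and flattening the element's contents is the more usual approach.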
Answer 1 (score: 0)
Here is a good blog post on the topic:
https://community.snowflake.com/s/article/Querying-Nested-XML-in-Snowflake
Also, please find below a way to query nested XML elements.
Sample XML:
<?xml version="1.0"?>
<comtec version="2008">
<customer_transport_order>
<id>2880ORO</id>
<order_number>99833104701</order_number>
<priority>0</priority>
<order_date>2019-03-22</order_date>
<order_kind>
<code>VMI</code>
<name>VMI</name>
</order_kind>
<operational>true</operational>
<order_status>
<code>cancel</code>
<name>cancel</name>
<status_kind>cancel</status_kind>
</order_status>
<contact>
<id>CEN143096</id>
<code>CEN127431</code>
<name>SOUTHERN UNITED ENTERPRISES</name>
</contact>
</customer_transport_order>
</comtec>
Sample Query:
select
XMLGET( cust.value, 'order_number' ):"$"::integer as cust_order,
XMLGET( cust.value, 'order_date' ):"$"::string as cust_date,
XMLGET( orderkind.value, 'code' ):"$"::string as order_kind,
XMLGET( contactval.value, 'id' ):"$"::string as contactval,
XMLGET( contactval.value, 'code' ):"$"::string as contactcode,
XMLGET( contactval.value, 'name' ):"$"::string as contactname
from
dept_emp_addr
, lateral FLATTEN(dept_emp_addr.xmldata:"$") cust
, lateral FLATTEN(cust.value:"$") orderkind
, lateral FLATTEN(cust.value:"$") contactval
where cust.value like '<customer_transport_order>%' AND orderkind.value like '<order_kind>%'
AND contactval.value like '<contact>%'
ORDER BY cust_order;
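When the document contains a single "customer_transport_order", as in the sample XML above, the same columns can also be reached without FLATTEN by nesting XMLGET calls. This is only a sketch under that assumption: the inline CTE stands in for the dept_emp_addr table and its xmldata column, the XML is trimmed to the fields being queried, and the "cto" alias is made up for illustration.

```sql
WITH dept_emp_addr AS (
    SELECT PARSE_XML($1) AS xmldata
    FROM VALUES
    ($$
<comtec version="2008">
  <customer_transport_order>
    <order_number>99833104701</order_number>
    <order_date>2019-03-22</order_date>
    <order_kind><code>VMI</code><name>VMI</name></order_kind>
    <contact>
      <id>CEN143096</id>
      <code>CEN127431</code>
      <name>SOUTHERN UNITED ENTERPRISES</name>
    </contact>
  </customer_transport_order>
</comtec>
$$)
),
orders AS (
    -- step from the root (comtec) down to its customer_transport_order child
    SELECT XMLGET(xmldata, 'customer_transport_order') AS cto
    FROM dept_emp_addr
)
SELECT
    XMLGET(cto, 'order_number'):"$"::integer                AS cust_order,
    XMLGET(cto, 'order_date'):"$"::string                   AS cust_date,
    XMLGET(XMLGET(cto, 'order_kind'), 'code'):"$"::string   AS order_kind,
    XMLGET(XMLGET(cto, 'contact'), 'id'):"$"::string        AS contactval,
    XMLGET(XMLGET(cto, 'contact'), 'code'):"$"::string      AS contactcode,
    XMLGET(XMLGET(cto, 'contact'), 'name'):"$"::string      AS contactname
FROM orders
ORDER BY cust_order;
```

This form returns one row per XML document, so it is not a replacement for the FLATTEN query when a document holds several orders.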