I have an XML file like this (test.xml):
<?xml version="1.0" encoding="ISO-8859-1"?>
<s2xResponse>
<s2xData>
<Name>This is the name</Name>
<InfocomData>
<DateOfUpdate day="07" month="02" year="2018">20180207</DateOfUpdate>
<CompanyName>MY COMPANY</CompanyName>
<TaxCode FlagCheck="0">XXXYYYWWWZZZ</TaxCode>
</InfocomData>
<AssessmentSummary>
<Rating Code="2">Rating Description for Code 2</Rating>
</AssessmentSummary>
<AssessmentData>
<SectorialDistribution>
<CompaniesNumber>11650</CompaniesNumber>
<ScoreDistribution />
<CervedScoreDistribution>
<DistributionData>
<Rating Code="1">SICUREZZA</Rating>
<Percentage>1.91</Percentage>
</DistributionData>
<DistributionData>
<Rating Code="2">SOLVIBILITA' ELEVATA</Rating>
<Percentage>35.56</Percentage>
</DistributionData>
</CervedScoreDistribution>
</SectorialDistribution>
</AssessmentData>
</s2xData>
</s2xResponse>
I am trying to get the text of the "Name" node ("This is the name") with a U-SQL script that uses XmlExtractor. Here is the code I am using:
USE TestXML; // It contains the registered assembly
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
@xml = EXTRACT xml_text string
FROM "textxpath/test.xml"
USING Extractors.Text(rowDelimiter: "^", quoting: false);
@xml_cleaned =
SELECT
xml_text.Replace("\r\n", "").Replace("\t", " ") AS xml_text
FROM @xml;
@values =
SELECT Microsoft.Analytics.Samples.Formats.Xml.XPath.Evaluate(xml_text, "s2xResponse/s2xData/Name")[1] AS value
FROM @xml_cleaned;
OUTPUT @values TO @"outputs/test_xpath.txt" USING Outputters.Text(quoting: false);
But I get this runtime error:
Execution failed with error '1_SV1_Extract Error : '{"diagnosticCode":195887116,"severity":"Error","component":"RUNTIME","source":"User","errorId":"E_RUNTIME_USER_EXPRESSIONEVALUATION","message":"Error while evaluating expression Microsoft.Analytics.Samples.Formats.Xml.XPath.Evaluate(xml_text.Replace(\"\r\n\", \"\").Replace(\"\t\", \" \"), \"s2xResponse/s2xData/Name\")[1]","description":"Inner exception from user expression: Index was out of range. Must be non-negative and less than the size of the collection.
Even if I use a zero index ([0]) on the Evaluate result, I get the same error.
What is wrong with my query?
Answer 0 (score: 2)
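The problem here is that you are applying the subscript [1] to the result of XPath.Evaluate, which I believe returns the Name nodes. You are applying the [1] subscript in code rather than inside the XPath, so the subscript is probably zero-based, not 1-based as it is in XPath. Hence the Index out of range error.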
Here is a solution: apply the subscript operator inside the XPath (where it is still 1-based) and select text() there:
.Evaluate("s2xResponse/s2xData/Name[1]/text()")
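The difference between the two index bases can be sketched outside U-SQL. The snippet below is only an illustration in Python, with the standard-library ElementTree standing in for the XPath helper; it is not the Microsoft.Analytics.Samples.Formats API, and the trimmed XML and all variable names are assumptions made for the example. It shows that a match list indexed in code starts at 0, while a positional predicate inside the path starts at 1.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of test.xml from the question (illustrative only).
XML_TEXT = """<s2xResponse>
  <s2xData>
    <Name>This is the name</Name>
  </s2xData>
</s2xResponse>"""

root = ET.fromstring(XML_TEXT)  # root is the <s2xResponse> element

# Subscript applied in code: the list of matches is zero-based.
matches = root.findall("s2xData/Name")
first_by_code_index = matches[0].text

# Subscript applied inside the path: positional predicates are 1-based.
first_by_xpath_index = root.find("s2xData/Name[1]").text


def out_of_range_when_using_one() -> bool:
    """Return True if matches[1] fails, mirroring an index-out-of-range error."""
    try:
        matches[1]
    except IndexError:
        return True
    return False
```

With a single Name element, index 1 in code points past the end of the list, whereas Name[1] inside the path selects the first and only match.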
Answer 1 (score: 1)
Is there a particular reason you need that method? I got this working with XmlDomExtractor:
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
DECLARE @inputFile string = "/input/input100.xml";
@input =
EXTRACT Name string
FROM @inputFile
USING new Microsoft.Analytics.Samples.Formats.Xml.XmlDomExtractor(rowPath : "/s2xResponse",
columnPaths : new SQL.MAP<string, string>{
{ "s2xData/Name", "Name" },
}
);
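@output =
SELECT *
FROM @input;
// Hypothetical output path; a U-SQL script needs an OUTPUT statement to produce a result.
OUTPUT @output TO "/output/output.csv" USING Outputters.Csv();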
That worked for me, and the same approach lets you extract multiple values from the XML in one pass.
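The idea of mapping several relative paths to named columns can be sketched as follows. This is a rough Python analogue of the rowPath and columnPaths arguments, written with the standard-library ElementTree; it is not the XmlDomExtractor implementation, and COLUMN_PATHS and extract_row are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of test.xml from the question (illustrative only).
XML_TEXT = """<s2xResponse>
  <s2xData>
    <Name>This is the name</Name>
    <InfocomData>
      <CompanyName>MY COMPANY</CompanyName>
      <TaxCode FlagCheck="0">XXXYYYWWWZZZ</TaxCode>
    </InfocomData>
  </s2xData>
</s2xResponse>"""

# Hypothetical mapping: path relative to the row element -> output column name.
COLUMN_PATHS = {
    "s2xData/Name": "Name",
    "s2xData/InfocomData/CompanyName": "CompanyName",
    "s2xData/InfocomData/TaxCode": "TaxCode",
}


def extract_row(row_element, column_paths):
    """Build one row by looking up each relative path under row_element."""
    row = {}
    for path, column in column_paths.items():
        node = row_element.find(path)
        # A missing element yields None instead of raising an error.
        row[column] = node.text if node is not None else None
    return row


# The root <s2xResponse> element plays the role of the row here.
row = extract_row(ET.fromstring(XML_TEXT), COLUMN_PATHS)
```

Each extra entry in the mapping adds one more column to the row, which is the same shape as adding further pairs to the columnPaths map in the script above.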