I have a set of documents with a structure like:
<DOCUMENT>
  <AMOUNTS>
    <ELEMENT>
      <AMOUNT>10.00</AMOUNT>
      <INFO>
        <CODE1>132</CODE1>
        <CODE2>022</CODE2>
      </INFO>
    </ELEMENT>
    <ELEMENT>
      <AMOUNT>10.00</AMOUNT>
      <INFO>
        <CODE1>132</CODE1>
        <CODE2>121</CODE2>
      </INFO>
    </ELEMENT>
    <ELEMENT>
      <AMOUNT>15.00</AMOUNT>
      <INFO>
        <CODE1>156</CODE1>
        <CODE2>121</CODE2>
      </INFO>
    </ELEMENT>
  </AMOUNTS>
</DOCUMENT>
I want to compute various sums of the AMOUNT element, so I've put a path range index on DOCUMENT/AMOUNTS/ELEMENT/AMOUNT, hoping to use the cts:sum-aggregate function. However, I'm seeing an issue with cts:sum-aggregate when the sum involves documents that contain more than one AMOUNT element with the same value. To illustrate, assume the XML above is stored at the URI '/DOCS/DOC1.XML'. I then run the following XQuery to get the sum of all the AMOUNT values in the document. I compute the sum in two different ways and get two different results:
(
  fn:sum(doc('/DOCS/DOC1.XML')/DOCUMENT/AMOUNTS/ELEMENT/AMOUNT),
  cts:sum-aggregate(
    cts:path-reference("DOCUMENT/AMOUNTS/ELEMENT/AMOUNT"),
    ("any"),
    cts:document-query('/DOCS/DOC1.XML')
  )
)
The fn:sum function gives 35 and cts:sum-aggregate gives 25. The cts:sum-aggregate function includes only one of the two 10.00 values in the sum.
I think I'm doing something wrong, but I can't figure out what. Can someone shed some light on this for me?
Thanks
David
Answer 0 (score: 2)
After reading the answer from wst, I confirmed that my index type was decimal, then experimented with the options and found that adding "item-frequency" as an option to cts:sum-aggregate solved my issue. I don't completely understand the nuances between "item-frequency" and "fragment-frequency" in relation to cts:sum-aggregate, but the following XQuery works as I expect: both sums return the same value.
(
  fn:sum(doc('/DOCS/DOC1.XML')/DOCUMENT/AMOUNTS/ELEMENT/AMOUNT),
  cts:sum-aggregate(
    cts:path-reference("DOCUMENT/AMOUNTS/ELEMENT/AMOUNT"),
    ("item-frequency"),
    cts:document-query('/DOCS/DOC1.XML')
  )
)
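For anyone else puzzled by the difference between "item-frequency" and "fragment-frequency", here is a rough Python sketch of how I understand it. It is a toy model, not MarkLogic's implementation. It assumes that fragment frequency counts each distinct value once per fragment (document) that contains it, that item frequency counts every occurrence, and that the aggregate sums value times frequency. The `fragments` dict and the `sum_aggregate` helper are hypothetical names made up for this illustration.

```python
from collections import Counter
from decimal import Decimal

# Hypothetical stand-in for the AMOUNT values indexed in each fragment.
fragments = {
    "/DOCS/DOC1.XML": [Decimal("10.00"), Decimal("10.00"), Decimal("15.00")],
}

def sum_aggregate(fragments, frequency="fragment"):
    """Toy model: sum of (distinct value * its frequency)."""
    freq = Counter()
    for values in fragments.values():
        if frequency == "fragment":
            # A value counts once per fragment, however often it repeats there.
            freq.update(set(values))
        else:
            # "item": every occurrence of the value counts.
            freq.update(values)
    return sum(value * count for value, count in freq.items())

print(sum_aggregate(fragments, "fragment"))  # 25.00
print(sum_aggregate(fragments, "item"))      # 35.00
```

Under these assumptions, the repeated 10.00 in a single document is collapsed to one count with fragment frequency, which matches the 25 I was seeing, and counted twice with item frequency, which matches the 35 from fn:sum.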
Answer 1 (score: 1)
Is your path index a string type or a numeric type (float, double, etc.)? I wouldn't expect this to work at all with strings, but maybe it does, and I don't see you passing an option to set the type to a number (("any", "type=double")).
String indexes combine identical (according to the collation) values into a single entry and increment the entry's cts:frequency. If cts:sum-aggregate does work over string indexes (and I don't see anything in the documentation to suggest otherwise), that could explain why the duplicate value is only counted once.