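Suppose we have n records. I want to compute the similarity between each record and every other record, and build a similarity matrix from the results. I'm new to XQuery, but I'm working on it. I've attached a pair of screenshots showing the similarity between one pair of records.
This is a CSV string. I use the following for loop to generate this sample: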
for $item1 at $index in /rec:Record
let $records:= /rec:Record
for $item2 in $records[$index + 1]
(: here I call the similarity functions :)
return
(: csv output :)
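I need to edit the for loop so that it generates a similarity matrix covering every pair of records in the dataset. How can I do that?
Note: the similarity functions are ready; my problem is not computing the similarity itself.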
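Answer 0 (score: 2)
EDIT: added CSV output as a text node at the end.
Consider the power of maps in MarkLogic.
A sample that represents the matrix in MarkLogic follows. I also threw in two extra things: a function that acts as a placeholder for your formula (it is also passed the original sequence, in case you need all of it for analysis), and a small function that shows how to access the map of maps.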
xquery version "1.0-ml";

(: Render the map of maps as CSV text, one matrix row per line. :)
declare function local:csv($matrix){
  let $nl := "&#10;"
  return text{
    for $x in map:keys($matrix)
    let $row := map:get($matrix, $x)
    order by xs:int($x)
    return fn:string-join(
      for $y in map:keys($row)
      order by xs:int($y)
      return xs:string(map:get($row, $y)),
      ",") || $nl
  }
};

(: Placeholder for the real similarity formula. :)
declare function local:my-formula($x, $y, $seq){
  let $foo := "do something"
  return "your-formula for " || xs:string($x) || " and " || xs:string($y)
};

(: Render the map of maps as XML so that its structure is easy to see. :)
declare function local:pretty($matrix){
  <matrix>
  {
    for $x in map:keys($matrix)
    order by xs:int($x)
    return <row>
    {
      let $row := map:get($matrix, $x)
      for $y in map:keys($row)
      order by xs:int($y)
      return <cell x="{$x}" y="{$y}">{map:get($row, $y)}</cell>
    }
    </row>
  }
  </matrix>
};

let $matrix := map:map()
(: the duplicate "5" collapses into a single map key :)
let $numbers := "1,2,3,4,5,5,6,7,8"
let $seq := fn:tokenize($numbers, ",")
(: outer map: one entry per row; inner map: one entry per column :)
let $_ := for $x in $seq
          let $map := map:map()
          let $_ := for $y in $seq
                    return map:put($map, $y, local:my-formula($x, $y, $seq))
          return map:put($matrix, $x, $map)
return local:pretty($matrix)
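You could simply dump out the map of maps ($matrix) directly. However, the local:pretty function returns a format that lets you easily see how the map is constructed: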
<matrix>
<row>
<cell x="1" y="1">your-formula for 1 and 1</cell>
<cell x="1" y="2">your-formula for 1 and 2</cell>
<cell x="1" y="3">your-formula for 1 and 3</cell>
<cell x="1" y="4">your-formula for 1 and 4</cell>
<cell x="1" y="5">your-formula for 1 and 5</cell>
<cell x="1" y="6">your-formula for 1 and 6</cell>
<cell x="1" y="7">your-formula for 1 and 7</cell>
<cell x="1" y="8">your-formula for 1 and 8</cell>
</row>
<row>
<cell x="2" y="1">your-formula for 2 and 1</cell>
<cell x="2" y="2">your-formula for 2 and 2</cell>
<cell x="2" y="3">your-formula for 2 and 3</cell>
<cell x="2" y="4">your-formula for 2 and 4</cell>
<cell x="2" y="5">your-formula for 2 and 5</cell>
<cell x="2" y="6">your-formula for 2 and 6</cell>
<cell x="2" y="7">your-formula for 2 and 7</cell>
<cell x="2" y="8">your-formula for 2 and 8</cell>
</row>
<row>
<cell x="3" y="1">your-formula for 3 and 1</cell>
<cell x="3" y="2">your-formula for 3 and 2</cell>
<cell x="3" y="3">your-formula for 3 and 3</cell>
<cell x="3" y="4">your-formula for 3 and 4</cell>
<cell x="3" y="5">your-formula for 3 and 5</cell>
<cell x="3" y="6">your-formula for 3 and 6</cell>
<cell x="3" y="7">your-formula for 3 and 7</cell>
<cell x="3" y="8">your-formula for 3 and 8</cell>
</row>
<row>
<cell x="4" y="1">your-formula for 4 and 1</cell>
<cell x="4" y="2">your-formula for 4 and 2</cell>
<cell x="4" y="3">your-formula for 4 and 3</cell>
<cell x="4" y="4">your-formula for 4 and 4</cell>
<cell x="4" y="5">your-formula for 4 and 5</cell>
<cell x="4" y="6">your-formula for 4 and 6</cell>
<cell x="4" y="7">your-formula for 4 and 7</cell>
<cell x="4" y="8">your-formula for 4 and 8</cell>
</row>
<row>
<cell x="5" y="1">your-formula for 5 and 1</cell>
<cell x="5" y="2">your-formula for 5 and 2</cell>
<cell x="5" y="3">your-formula for 5 and 3</cell>
<cell x="5" y="4">your-formula for 5 and 4</cell>
<cell x="5" y="5">your-formula for 5 and 5</cell>
<cell x="5" y="6">your-formula for 5 and 6</cell>
<cell x="5" y="7">your-formula for 5 and 7</cell>
<cell x="5" y="8">your-formula for 5 and 8</cell>
</row>
<row>
<cell x="6" y="1">your-formula for 6 and 1</cell>
<cell x="6" y="2">your-formula for 6 and 2</cell>
<cell x="6" y="3">your-formula for 6 and 3</cell>
<cell x="6" y="4">your-formula for 6 and 4</cell>
<cell x="6" y="5">your-formula for 6 and 5</cell>
<cell x="6" y="6">your-formula for 6 and 6</cell>
<cell x="6" y="7">your-formula for 6 and 7</cell>
<cell x="6" y="8">your-formula for 6 and 8</cell>
</row>
<row>
<cell x="7" y="1">your-formula for 7 and 1</cell>
<cell x="7" y="2">your-formula for 7 and 2</cell>
<cell x="7" y="3">your-formula for 7 and 3</cell>
<cell x="7" y="4">your-formula for 7 and 4</cell>
<cell x="7" y="5">your-formula for 7 and 5</cell>
<cell x="7" y="6">your-formula for 7 and 6</cell>
<cell x="7" y="7">your-formula for 7 and 7</cell>
<cell x="7" y="8">your-formula for 7 and 8</cell>
</row>
<row>
<cell x="8" y="1">your-formula for 8 and 1</cell>
<cell x="8" y="2">your-formula for 8 and 2</cell>
<cell x="8" y="3">your-formula for 8 and 3</cell>
<cell x="8" y="4">your-formula for 8 and 4</cell>
<cell x="8" y="5">your-formula for 8 and 5</cell>
<cell x="8" y="6">your-formula for 8 and 6</cell>
<cell x="8" y="7">your-formula for 8 and 7</cell>
<cell x="8" y="8">your-formula for 8 and 8</cell>
</row>
</matrix>
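For CSV, there is a sample function called local:csv that creates a text node. Calling it instead of local:pretty gives the following result: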
your-formula for 1 and 1,your-formula for 1 and 2,your-formula for 1 and 3,your-formula for 1 and 4,your-formula for 1 and 5,your-formula for 1 and 6,your-formula for 1 and 7,your-formula for 1 and 8
your-formula for 2 and 1,your-formula for 2 and 2,your-formula for 2 and 3,your-formula for 2 and 4,your-formula for 2 and 5,your-formula for 2 and 6,your-formula for 2 and 7,your-formula for 2 and 8
your-formula for 3 and 1,your-formula for 3 and 2,your-formula for 3 and 3,your-formula for 3 and 4,your-formula for 3 and 5,your-formula for 3 and 6,your-formula for 3 and 7,your-formula for 3 and 8
your-formula for 4 and 1,your-formula for 4 and 2,your-formula for 4 and 3,your-formula for 4 and 4,your-formula for 4 and 5,your-formula for 4 and 6,your-formula for 4 and 7,your-formula for 4 and 8
your-formula for 5 and 1,your-formula for 5 and 2,your-formula for 5 and 3,your-formula for 5 and 4,your-formula for 5 and 5,your-formula for 5 and 6,your-formula for 5 and 7,your-formula for 5 and 8
your-formula for 6 and 1,your-formula for 6 and 2,your-formula for 6 and 3,your-formula for 6 and 4,your-formula for 6 and 5,your-formula for 6 and 6,your-formula for 6 and 7,your-formula for 6 and 8
your-formula for 7 and 1,your-formula for 7 and 2,your-formula for 7 and 3,your-formula for 7 and 4,your-formula for 7 and 5,your-formula for 7 and 6,your-formula for 7 and 7,your-formula for 7 and 8
your-formula for 8 and 1,your-formula for 8 and 2,your-formula for 8 and 3,your-formula for 8 and 4,your-formula for 8 and 5,your-formula for 8 and 6,your-formula for 8 and 7,your-formula for 8 and 8
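If you only need one similarity out of the map of maps rather than the whole matrix, a lookup could look roughly like the sketch below. The helper name local:cell and the stored value 0.75 are made up for illustration; only map:map, map:put and map:get are MarkLogic built-ins.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: read the cell at row $x, column $y from a map of maps.
   Returns the empty sequence if the row or the cell does not exist. :)
declare function local:cell($matrix as map:map, $x as xs:string, $y as xs:string)
{
  let $row := map:get($matrix, $x)
  return if (fn:exists($row)) then map:get($row, $y) else ()
};

(: Build a tiny matrix with a single cell (row "2", column "3"). :)
let $matrix := map:map()
let $row := map:map()
let $_ := map:put($row, "3", 0.75)
let $_ := map:put($matrix, "2", $row)
return local:cell($matrix, "2", "3")
```

This returns the value stored for row "2", column "3". The keys are strings here because the sample above builds them with fn:tokenize.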
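Answer 1 (score: 1)
You could do something like this. I'm not sure what your CSV looks like or how your parser loads it. I also mocked up the kind of function you said you already have.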
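(: Mock similarity function; it always returns the same CSV string. :)
declare function local:somefn($listA as xs:integer*, $listB as xs:integer*) as xs:string {
  "6,7,10,3"
};

let $data :=
  <csv>
    <row>1,1,1</row>
    <row>2,2,2</row>
    <row>3,3,3</row>
    <row>4,4,4</row>
  </csv>
for $row1 at $pos in $data/row
(: pair each row only with the rows after it, so every unordered pair appears once :)
for $row2 in $data/row[position() > $pos]
let $x := local:somefn(
  for $v in tokenize($row1, ",") return xs:integer($v),
  for $v in tokenize($row2, ",") return xs:integer($v))
return $x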
In BaseX this produces:
6,7,10,3
6,7,10,3
6,7,10,3
6,7,10,3
6,7,10,3
6,7,10,3
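The loop above visits each unordered pair once, which gives 6 results for 4 rows. If you want the full n × n matrix as CSV instead, one option is to drop the position filter and join the inner loop into one line per row. The following is a minimal sketch in plain XQuery, not tied to MarkLogic or BaseX; local:similarity is a hypothetical stand-in that counts shared comma-separated values, and the sample rows are invented.

```xquery
(: Hypothetical stand-in for the real similarity functions:
   the number of distinct comma-separated values the two rows share. :)
declare function local:similarity($a as element(row), $b as element(row)) as xs:integer {
  count(distinct-values(tokenize($a, ",")[. = tokenize($b, ",")]))
};

let $data :=
  <csv>
    <row>1,2,3</row>
    <row>2,3,4</row>
    <row>3,4,5</row>
  </csv>
let $rows := $data/row
return string-join(
  (: one CSV line per row of the matrix :)
  for $r1 in $rows
  return string-join(
    (: one cell per column: similarity of $r1 with every row, itself included :)
    for $r2 in $rows
    return string(local:similarity($r1, $r2)),
    ","),
  "&#10;")
```

With the sample data this yields three lines: 3,2,1 then 2,3,2 then 1,2,3. The diagonal holds each row compared with itself. In the original query, $rows would be /rec:Record and the function body would call the real similarity functions.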