Question

Scenario:

I have the following (simplified) database table schema:

ID   ProductName          ProductCategory   Colour   Price
----------------------------------------------------------
1    BatmanTShirt         T-Shirt           Black    22
2    BatmanTShirt         T-Shirt           Blue     20
3    SupermanTShirt       T-Shirt           Blue     19
4    SpidermanTrousers    Trousers          Red      28
5    SpidermanTrousers    Trousers          Black    30

What I want:

In the SOLR index, I would like to map this data in a normalized way, so that only 3 SOLR documents are created (as shown below) instead of 5.

<doc1>
  <ID>1</ID>
  <ProductName>BatmanTShirt</ProductName>
  <ProductCategory>T-Shirt</ProductCategory>
  <OtherDetails>{ {1, Black, 22}, {2, Blue, 20} }</OtherDetails>
</doc1>
<doc2>
  <ID>3</ID>
  <ProductName>SupermanTShirt</ProductName>
  <ProductCategory>T-Shirt</ProductCategory>
  <OtherDetails>{ {3, Blue, 19} }</OtherDetails>
</doc2>
<doc3>
  <ID>4</ID>
  <ProductName>SpidermanTrousers</ProductName>
  <ProductCategory>Trousers</ProductCategory>
  <OtherDetails>{ {4, Red, 28}, {5, Black, 30} }</OtherDetails>
</doc3>

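Some notes:

<ID> will contain the lowest ID in the group.
<OtherDetails> will contain the unique IDs together with the other details that are left out by the grouping. It would be a multivalued field of type List, where each entry is itself a list of details {ID, Colour, Price}.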
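
The grouping described in these notes can be sketched outside of SOLR as a preprocessing step. This is only an illustration in Python, not something the question or answer prescribes: the function name `group_rows` is hypothetical, and encoding each detail as a comma-separated string is an assumption, made because a multivalued field holds flat values rather than nested lists.

```python
from itertools import groupby
from operator import itemgetter

# Rows copied from the table above: (ID, ProductName, ProductCategory, Colour, Price)
ROWS = [
    (1, "BatmanTShirt", "T-Shirt", "Black", 22),
    (2, "BatmanTShirt", "T-Shirt", "Blue", 20),
    (3, "SupermanTShirt", "T-Shirt", "Blue", 19),
    (4, "SpidermanTrousers", "Trousers", "Red", 28),
    (5, "SpidermanTrousers", "Trousers", "Black", 30),
]


def group_rows(rows):
    """Collapse rows that share ProductName and ProductCategory into one document."""
    # groupby only merges adjacent items, so sort by the grouping key (then ID) first.
    ordered = sorted(rows, key=lambda r: (r[1], r[2], r[0]))
    docs = []
    for (name, category), members in groupby(ordered, key=itemgetter(1, 2)):
        members = list(members)
        docs.append({
            "ID": min(m[0] for m in members),  # lowest ID in the group
            "ProductName": name,
            "ProductCategory": category,
            # Assumption: one "ID,Colour,Price" string per variant.
            "OtherDetails": [f"{m[0]},{m[3]},{m[4]}" for m in members],
        })
    return sorted(docs, key=itemgetter("ID"))


docs = group_rows(ROWS)
for doc in docs:
    print(doc)
```

With the five rows above this produces three documents whose IDs are 1, 3 and 4, matching the sketch in the question.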

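Question:

Does anyone know how this can be done?

P.S.

The reason for this "grouping" step is that I want to facet on ProductCategory. If I facet on ProductCategory as things stand, the resulting counts are: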

T-Shirt (3)
Trousers (2)

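What I want instead is a count on ProductCategory that ignores the colour and price data: only 2 T-shirts (one Batman, one Superman) and only 1 pair of trousers (Spiderman). So what I want to display is: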

T-Shirt (2)
Trousers (1)
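
To make the difference between the two sets of counts concrete, here is a small Python sketch that reproduces them from the table. It only illustrates the arithmetic (count per row versus count per distinct product); it is not how SOLR computes facets, and the variable names are made up for the example.

```python
from collections import Counter

# (ProductName, ProductCategory) pairs for the five rows in the table above
ROWS = [
    ("BatmanTShirt", "T-Shirt"),
    ("BatmanTShirt", "T-Shirt"),
    ("SupermanTShirt", "T-Shirt"),
    ("SpidermanTrousers", "Trousers"),
    ("SpidermanTrousers", "Trousers"),
]

# Plain faceting: every row (every colour/price variant) is counted.
per_row = Counter(category for _, category in ROWS)

# Faceting after grouping: each distinct product is counted once.
per_product = Counter(category for _, category in set(ROWS))

print(dict(per_row))
print(dict(per_product))
```

The first counter gives 3 and 2, the second gives 2 and 1.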

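I did some research and found that this feature (called post-group faceting or matrix counts) is currently WIP, as described in this SOLR patch. So I would like a temporary workaround, since it may take a while before that is finished.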

Answer 1

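The patch works with single-valued fields, so using the patch together with grouping is the best approach.

Just index the data as it is in the database; that way you do not need a multivalued field.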
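
As a rough idea of what a grouped, faceted request could look like once the rows are indexed one-to-one, here is a Python sketch that only builds the query string. The host, port and core name (`products`) are hypothetical. The `group.*` parameters are the ones documented for result grouping in released Solr versions; whether a build with the patch applied accepts exactly the same names is an assumption you would need to check.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/products/select"

params = {
    "q": "*:*",
    "rows": 10,
    # Collapse the five rows into one group per product name.
    "group": "true",
    "group.field": "ProductName",   # must be a single-valued field
    "group.ngroups": "true",        # also report the number of groups
    # Facet on the category, counting one document per group
    # instead of every colour/price variant.
    "facet": "true",
    "facet.field": "ProductCategory",
    "group.truncate": "true",
}

url = f"{SOLR_SELECT_URL}?{urlencode(params)}"
print(url)
```

The printed URL can be pasted into a browser or passed to any HTTP client to see whether the facet counts come back as T-Shirt (2) and Trousers (1).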

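You can use TortoiseSVN to download the latest code and apply the patch. Building the WAR (or JAR) in Eclipse is straightforward: start a new project from the code you just downloaded, then run the ant scripts in the build.xml files of the root and solr directories.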
