Solr is slow while indexing

Asked: 2012-04-18 11:37:19

Tags: performance solr spell-checking

I am indexing more than 100 CSV files with 10,000 rows, and then querying the index for similar spellings. While this is going on, indexing is very slow.

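I have found a few promising solutions:

  1. Use a master/slave setup, with the master doing the indexing and the slaves serving queries. See "How to index records in Solr faster (and not impact ColdFusion web server)? Two JVM?"

  2. Use Trie ranges: http://www.lucidimagination.com/blog/2009/05/13/exploring-lucene-and-solrs-trierange-capabilities/

I know these two solutions are different. Which one should I prefer? Is the second one even applicable to my problem? I would also welcome any other solutions for my spell-checking problem.

Thanks in advance.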

2 answers:

Answer 0 (score: 8)

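Indexing will usually make queries slow. If you have fast disks, indexing will use 100% of the CPU; otherwise it will use 100% of the disk bandwidth. Either way, queries will be slow.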

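A master/slave configuration is the standard solution for this. The slave servers are dedicated to search queries. The only time they slow down is right after replication, when new searchers are created with fresh, still-cold caches.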

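A master/slave configuration probably won't make indexing faster, but it will avoid slow query performance. There has been work on making indexing multi-threaded, so you might want to test running several indexing tasks at once. If the bottleneck is disk IO, that won't help; it only helps when indexing is using 100% of a single CPU.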

Trie fields are great for range queries. I doubt they have much effect on indexing speed.

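Finally, you may want to tune the spelling suggestion options. Spelling suggestions can do a lot of work, and you may get good results with different, cheaper options.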
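As a rough sketch of "several indexing tasks at once", the function below posts CSV files to Solr in parallel with `xargs -P`. The URL, the core name `mycore`, the `JOBS` default, and the `CURL` override are assumptions for illustration, not part of the original setup. It also assumes a Solr version whose `/update` handler accepts `Content-type:application/csv`; older releases used a separate `/update/csv` handler.

```shell
# Sketch: post several CSV files to Solr at the same time.
# The URL, the core name "mycore", and the job count are assumptions.
SOLR_UPDATE_URL="${SOLR_UPDATE_URL:-http://localhost:8983/solr/mycore/update}"
JOBS="${JOBS:-4}"      # concurrent uploads; try 1, 2, 4 and compare timings
CURL="${CURL:-curl}"   # set CURL="echo curl" to print commands instead of running them

# Reads file names from stdin (one per line) and posts up to $JOBS of them
# in parallel. No commit is requested here; commit once after the whole batch.
index_csv_files() {
  # $CURL is intentionally unquoted so that "echo curl" splits into two words.
  xargs -P "$JOBS" -I{} $CURL -sS \
    -H 'Content-type:application/csv' \
    --data-binary @{} "$SOLR_UPDATE_URL"
}
```

For example, `ls csv/*.csv | index_csv_files` would upload every file under a hypothetical `csv/` directory. If total time does not drop as `JOBS` grows, the bottleneck is probably disk IO rather than a single busy CPU.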

Answer 1 (score: 1)

You can often get good query performance while bulk indexing without resorting to a blue/green setup.

Here are some tips for achieving that:

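  • If you are inserting a large number of documents, use https://github.com/lucidworks/spark-solr if you can. It has a robust bulk import mechanism and is relatively easy to use. Don't write your own Solr bulk import code unless you have to.
  • If you must use solrj, make sure you send documents in batches. See the add(Collection<SolrInputDocument>) method. Flooding Solr with one HTTP request per insert will slow queries down considerably.
  • Don't commit too often. Commits are expensive!
  • Use autowarming. This is a Solr feature that warms the caches every time a new searcher is created (after a commit). Keeping searches served from warm caches is important for getting fast queries right after a commit.
  • If you have many shards (say, more than 20), consider a "staggered commit" approach that commits one shard at a time. Here is a shell example for 16 Solr servers with 64 shards in total. Because only one shard commits at a time, the whole collection never commits at once, which reduces the impact of each commit.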
while :
do
  curl -v 'http://solr-0.solr:8983/solr/MyCollection_shard8_0_0_replica_n127/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-0.solr:8983/solr/MyCollection_shard8_0_1_replica_n128/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-0.solr:8983/solr/MyCollection_shard8_1_0_replica_n131/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-0.solr:8983/solr/MyCollection_shard8_1_1_replica_n132/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  
  curl -v 'http://solr-1.solr:8983/solr/MyCollection_shard4_0_0_replica_n95/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-1.solr:8983/solr/MyCollection_shard4_0_1_replica_n96/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-1.solr:8983/solr/MyCollection_shard4_1_0_replica_n99/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-1.solr:8983/solr/MyCollection_shard4_1_1_replica_n100/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-2.solr:8983/solr/MyCollection_shard3_0_0_replica_n87/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-2.solr:8983/solr/MyCollection_shard3_0_1_replica_n88/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-2.solr:8983/solr/MyCollection_shard3_1_0_replica_n91/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-2.solr:8983/solr/MyCollection_shard3_1_1_replica_n92/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-3.solr:8983/solr/MyCollection_shard7_0_0_replica_n119/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-3.solr:8983/solr/MyCollection_shard7_0_1_replica_n120/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-3.solr:8983/solr/MyCollection_shard7_1_0_replica_n123/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-3.solr:8983/solr/MyCollection_shard7_1_1_replica_n124/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-4.solr:8983/solr/MyCollection_shard2_0_0_replica_n79/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-4.solr:8983/solr/MyCollection_shard2_0_1_replica_n80/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-4.solr:8983/solr/MyCollection_shard2_1_0_replica_n83/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-4.solr:8983/solr/MyCollection_shard2_1_1_replica_n84/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-5.solr:8983/solr/MyCollection_shard1_0_0_replica_n71/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-5.solr:8983/solr/MyCollection_shard1_0_1_replica_n72/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-5.solr:8983/solr/MyCollection_shard1_1_0_replica_n75/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-5.solr:8983/solr/MyCollection_shard1_1_1_replica_n76/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-6.solr:8983/solr/MyCollection_shard5_0_0_replica_n159/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-6.solr:8983/solr/MyCollection_shard5_0_1_replica_n161/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-6.solr:8983/solr/MyCollection_shard5_1_0_replica_n163/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-6.solr:8983/solr/MyCollection_shard5_1_1_replica_n165/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-7.solr:8983/solr/MyCollection_shard6_0_0_replica_n151/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-7.solr:8983/solr/MyCollection_shard6_0_1_replica_n153/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-7.solr:8983/solr/MyCollection_shard6_1_0_replica_n155/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-7.solr:8983/solr/MyCollection_shard6_1_1_replica_n157/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-8.solr:8983/solr/MyCollection_shard8_0_0_replica_n135/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-8.solr:8983/solr/MyCollection_shard8_0_1_replica_n137/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-8.solr:8983/solr/MyCollection_shard8_1_0_replica_n139/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-8.solr:8983/solr/MyCollection_shard8_1_1_replica_n141/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-9.solr:8983/solr/MyCollection_shard4_0_0_replica_n143/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-9.solr:8983/solr/MyCollection_shard4_0_1_replica_n145/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-9.solr:8983/solr/MyCollection_shard4_1_0_replica_n147/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-9.solr:8983/solr/MyCollection_shard4_1_1_replica_n149/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-10.solr:8983/solr/MyCollection_shard6_0_0_replica_n111/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-10.solr:8983/solr/MyCollection_shard6_0_1_replica_n112/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-10.solr:8983/solr/MyCollection_shard6_1_0_replica_n115/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-10.solr:8983/solr/MyCollection_shard6_1_1_replica_n116/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-11.solr:8983/solr/MyCollection_shard5_0_0_replica_n103/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-11.solr:8983/solr/MyCollection_shard5_0_1_replica_n104/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-11.solr:8983/solr/MyCollection_shard5_1_0_replica_n107/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-11.solr:8983/solr/MyCollection_shard5_1_1_replica_n108/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-12.solr:8983/solr/MyCollection_shard2_0_0_replica_n167/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-12.solr:8983/solr/MyCollection_shard2_0_1_replica_n169/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-12.solr:8983/solr/MyCollection_shard2_1_0_replica_n171/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-12.solr:8983/solr/MyCollection_shard2_1_1_replica_n173/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-13.solr:8983/solr/MyCollection_shard1_0_0_replica_n175/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-13.solr:8983/solr/MyCollection_shard1_0_1_replica_n177/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-13.solr:8983/solr/MyCollection_shard1_1_0_replica_n179/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-13.solr:8983/solr/MyCollection_shard1_1_1_replica_n181/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-14.solr:8983/solr/MyCollection_shard3_0_0_replica_n183/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-14.solr:8983/solr/MyCollection_shard3_0_1_replica_n185/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-14.solr:8983/solr/MyCollection_shard3_1_0_replica_n187/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-14.solr:8983/solr/MyCollection_shard3_1_1_replica_n189/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  curl -v 'http://solr-15.solr:8983/solr/MyCollection_shard7_0_0_replica_n191/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-15.solr:8983/solr/MyCollection_shard7_0_1_replica_n193/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-15.solr:8983/solr/MyCollection_shard7_1_0_replica_n195/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'
  curl -v 'http://solr-15.solr:8983/solr/MyCollection_shard7_1_1_replica_n197/update?update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

  # 15 minute commit timer
  sleep 900
done
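  • Make sure your filter cache is large enough. This keeps filtering from consuming CPU and disk IO, so you will see much less impact while indexing.
  • Consider running a Solr optimize periodically. After an optimize, more of the index is likely to fit in memory, which means less disk IO. With less read IO competing for the disk, writing new documents is less likely to be held up.
  • If Solr runs on shared hardware, make sure other applications are not eating its CPU and disk resources. For example, if you use Apache Tika to parse documents, run Tika on a different host. Solr is best left on its own.
  • Make sure the VM has enough memory left for the file system cache. For example, on a machine with 32G of RAM, don't set the maximum heap size too high: -Xmx28G does not leave enough memory for the operating system's file system cache. Use empirical analysis to determine how much heap you actually need. -Xmx12G, for instance, leaves about 20G, more than half of the RAM, for the OS and its FS cache. The more your queries are served from cache, the less indexing will affect you.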
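The staggered-commit script above can be written more compactly by reading the host and core names from a list. This is only a sketch: the `cores.txt` file, the function names, and the `CURL` override are hypothetical, while the query string is copied unchanged from the script above.

```shell
# Sketch: commit one core at a time, driven by a list of "host:port core" pairs.
CURL="${CURL:-curl}"   # set CURL="echo curl" for a dry run
# Query string copied verbatim from the original script.
COMMIT_PARAMS='update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=blah&commit_end_point=true&version=2&expungeDeletes=false'

# Build the commit URL for one host:port and core name.
commit_url() {
  printf 'http://%s/solr/%s/update?%s\n' "$1" "$2" "$COMMIT_PARAMS"
}

# Read "host:port core" pairs from stdin and commit them one after another.
staggered_commit() {
  while read -r host core; do
    [ -n "$host" ] || continue   # skip blank lines
    # </dev/null keeps the command from consuming the list on stdin.
    $CURL -v "$(commit_url "$host" "$core")" </dev/null
  done
}
```

With a hypothetical `cores.txt` containing lines such as `solr-0.solr:8983 MyCollection_shard8_0_0_replica_n127`, the 15-minute cycle becomes `while :; do staggered_commit < cores.txt; sleep 900; done`.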