Sitecore 7 ContentSearch crawl failure: "Crawler : AddRecursive DoItemAdd failed"

Date: 2014-03-24 19:03:13

Tags: lucene sitecore sitecore7

When we try to rebuild our Lucene (ContentSearch) indexes, our CrawlingLog fills up with the following exception:

7052 15:08:21 WARN  Crawler : AddRecursive DoItemAdd failed - {5A1E50E4-46B9-42D5-B743-1ED10D15D47E}
Exception: System.AggregateException
Message: One or more errors occurred.
Source: mscorlib
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at System.Threading.Tasks.Parallel.PartitionerForEachWorker[TSource,TLocal](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 simpleBody, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.ForEachWorker[TSource,TLocal](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.ForEach[TSource](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body)
   at Sitecore.ContentSearch.AbstractDocumentBuilder`1.AddItemFields()
   at Sitecore.ContentSearch.LuceneProvider.CrawlerLuceneIndexOperations.GetIndexData(IIndexable indexable, IProviderUpdateContext context)
   at Sitecore.ContentSearch.LuceneProvider.CrawlerLuceneIndexOperations.BuildDataToIndex(IProviderUpdateContext context, IIndexable version)
   at Sitecore.ContentSearch.LuceneProvider.CrawlerLuceneIndexOperations.Add(IIndexable indexable, IProviderUpdateContext context, ProviderIndexConfiguration indexConfiguration)
   at Sitecore.ContentSearch.SitecoreItemCrawler.DoAdd(IProviderUpdateContext context, SitecoreIndexableItem indexable)
   at Sitecore.ContentSearch.HierarchicalDataCrawler`1.CrawlItem(Tuple`3 tuple)
Nested Exception
Exception: System.ArgumentOutOfRangeException
Message: Index and length must refer to a location within the string.
Parameter name: length
Source: mscorlib
   at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)
   at Sitecore.Data.ShortID.Encode(String guid)
   at Sitecore.ContentSearch.FieldReaders.MultiListFieldReader.GetFieldValue(IIndexableDataField indexableField)
   at Sitecore.ContentSearch.FieldReaders.FieldReaderMap.GetFieldValue(IIndexableDataField field)
   at Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder.AddField(IIndexableDataField field)
   at System.Threading.Tasks.Parallel.<>c__DisplayClass32`2.<PartitionerForEachWorker>b__30()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(Object param0)

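This appears to be caused by the ShortID.Encode(string) method, which expects the GUID in its string argument to be wrapped in braces ("{" and "}"). Some of our multilist field relationships were set programmatically using Guid.ToString(), which does not include the braces. Unfortunately, those values cause ShortID.Encode() to choke.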
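To make the difference concrete, here is a minimal sketch in plain .NET (no Sitecore types involved) of the two string forms. The class and method names are hypothetical and exist only for illustration.

```csharp
using System;

// Hypothetical demo class; not part of Sitecore.
public static class GuidFormatDemo
{
    // Default format ("D"): hyphenated, no braces.
    public static string Unbraced(Guid value)
    {
        return value.ToString();
    }

    // "B" format: hyphenated and wrapped in braces.
    public static string Braced(Guid value)
    {
        return value.ToString("B");
    }
}
```

For the item ID from the log above, `GuidFormatDemo.Unbraced(new Guid("5A1E50E4-46B9-42D5-B743-1ED10D15D47E"))` returns `5a1e50e4-46b9-42d5-b743-1ed10d15d47e`, while `Braced` returns `{5a1e50e4-46b9-42d5-b743-1ed10d15d47e}`.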

1 Answer:

Answer 0: (score: 0)

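  1. First things first: find every place where you call MultiListField.Add(string) and change Guid.ToString() to Guid.ToString("B"). This fixes the problem for all new relationships.
  2. Create a custom FieldReader class to replace the standard MultiListFieldReader (we called ours CustomMultiListFieldReader).
  3. Make the custom class inherit from Sitecore.ContentSearch.FieldReaders.FieldReader.
  4. Decompile the Sitecore.ContentSearch.FieldReaders.MultiListFieldReader.GetFieldValue(IIndexableDataField) method into the custom class.
  5. Before the if (ID.IsID(id)) line, add the following code: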

    if (!str.StartsWith("{") && !str.EndsWith("}"))
        id = String.Format("{{{0}}}", str);
    
  6. In the index configuration (we added it to the default, Sitecore.ContentSearch.DefaultIndexConfiguration.config), change the fieldReaderType for the MultiList fields to the custom type. (This can be found in the configuration at sitecore/contentSearch/configuration/defaultIndexConfiguration/fieldReaders/mapFieldByTypeName/fieldReader.)

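Full disclosure: I don't love this approach, because if the default implementation of MultiListFieldReader ever changes, we won't pick up those changes. But it does allow the items to be included in the index without having to reformat all of the GUIDs in every multilist field.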
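The brace-wrapping check from step 5 can also be sketched as a standalone helper, which may be easier to unit test than code pasted into a decompiled method. This is a sketch under assumptions, not the code from the answer: the class and method names are hypothetical, it uses Guid.TryParse instead of the StartsWith/EndsWith test, and it upper-cases the result only to match the ID format shown in the log.

```csharp
using System;

// Hypothetical helper; not part of Sitecore.
public static class MultiListIdNormalizer
{
    // Returns the value as a braced, upper-case GUID string,
    // or null when the value is not a parsable GUID.
    public static string ToBracedId(string raw)
    {
        Guid parsed;
        if (string.IsNullOrWhiteSpace(raw) || !Guid.TryParse(raw.Trim(), out parsed))
        {
            return null;
        }

        // "B" adds the surrounding braces; a value that already has
        // braces round-trips unchanged apart from casing.
        return parsed.ToString("B").ToUpperInvariant();
    }
}
```

Unlike the two-line check in step 5, Guid.TryParse also accepts other GUID spellings (for example 32 hex digits without hyphens), so this variant normalizes slightly more input than the original snippet does.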