Question

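I am using the following Groovy code to load files stored in MongoDB so that they can be indexed in Solr. (I have already created a file object that holds the file's contents and its file name.)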

ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract")

def tempFile = new File("temp/temp-${file.name}")
tempFile.append(file.file) //file.file references the byte[] of the file
//append call writes the file to disk

up.addFile(tempFile, "application/octet-stream")

up.setParam("literal.id", file.id.toString())
up.setParam("literal.name", "ConsultantFile")
up.setParam("literal.fileName_s", file.name)
up.setParam("literal.creator_s", file.createdBy?.lastFirstName)

up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true)

server.request(up) //server object is the shared handle to the Solr instance

tempFile.delete()

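So I already have the file's byte array, but I write it to disk just so I can use the addFile method. Then, as cleanup, I delete the file from disk. It works, but it's silly.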

I tried using the following code in place of up.addFile(), but it throws "non ok status:500, message:Server Error":

def stringFile = new String(file.file, "UTF-8")
def stream = new ContentStreamBase.StringStream(stringFile)
up.addContentStream(stream)

What is the best way to index a file that is already in memory, without having to write it to disk as an intermediate step?

Answer 1

You can look at the addContentStream method, which lets you pass a stream instead of a file.
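As a rough sketch of what that could look like: the StringStream attempt in the question decodes the raw bytes as UTF-8 text, which is lossy for binary files (PDF, Word, and so on) and may well be what triggers the 500. Wrapping the byte[] directly avoids both the String conversion and the temp file. This assumes your SolrJ version ships ContentStreamBase.ByteArrayStream, and it reuses the file and server objects from the question; it has not been run against your setup.

```groovy
import org.apache.solr.client.solrj.request.AbstractUpdateRequest
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest
import org.apache.solr.common.util.ContentStreamBase

def up = new ContentStreamUpdateRequest("/update/extract")

// Wrap the in-memory bytes as-is: no String decoding, no temp file.
// file.file is the byte[] from the question; file.name is only used as a source label.
// Assumption: ContentStreamBase.ByteArrayStream is available in your SolrJ version.
def stream = new ContentStreamBase.ByteArrayStream(file.file, file.name)
stream.setContentType("application/octet-stream")
up.addContentStream(stream)

// Same literal fields and commit behaviour as in the question.
up.setParam("literal.id", file.id.toString())
up.setParam("literal.name", "ConsultantFile")
up.setParam("literal.fileName_s", file.name)
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true)

server.request(up) // server is the shared Solr handle from the question
```

If that class is missing in an older SolrJ, a small ContentStreamBase subclass whose getStream() returns a ByteArrayInputStream over the same bytes should behave the same way.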

Also, if you have a web interface that exposes the files, you can check Sending_documents_to_Solr.

How to load files from a database into Solr
