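I badly need help. I am trying to search for a text string across a folder that contains more than 5,000 PDFs. The code below was tested with fewer than 100 PDF files and works, but with the full set it takes 5-10 minutes to return results. Any help is greatly appreciated: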
<%
'Search text
Dim strtextToSearch
strtextToSearch = Request("TextToSearch")

'Now, we want to search all of the files
Dim fso
'Constant to open files for reading
Const ForReading = 1
Set fso = Server.CreateObject("Scripting.FileSystemObject")

'Specify the folder path to search
Dim FolderToSearch
FolderToSearch = "C:\inetpub\site\Files\allpdfs\"

'Proceed only if the folder exists
If fso.FolderExists(FolderToSearch) Then
    Dim objFolder
    Set objFolder = fso.GetFolder(FolderToSearch)
    Dim objFile, objTextStream, strFileContents, bolFileFound
    bolFileFound = False
    Dim FilesCounter
    FilesCounter = 0 'Total files found

    'Every request opens and reads every file in the folder
    For Each objFile In objFolder.Files
        Set objTextStream = fso.OpenTextFile(objFile.Path, ForReading)
        'Read the whole file into memory
        strFileContents = objTextStream.ReadAll
        'Case-insensitive search (compare mode 1 = text compare)
        If InStr(1, strFileContents, strtextToSearch, 1) Then
%>
<a href="http://go.to.mysite.com/files/allpdfs/<%Response.Write objFile.Name%>" target="_blank">
<%
            Response.Write objFile.Name & "</a><br>"
            FilesCounter = FilesCounter + 1
        End If
        objTextStream.Close
    Next

    If FilesCounter = 0 Then
        Response.Write "Sorry, No matches found."
    Else
        Response.Write "Total files found : " & FilesCounter
    End If

    'Destroy the objects
    Set objTextStream = Nothing
    Set objFolder = Nothing
Else
    Response.Write "Sorry, invalid folder name"
End If
Set fso = Nothing
%>
Answer 0 (score: 1)
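Doing a full scan on every search will always be slow, because each request has to open and read all 5,000+ files. You are better off using an indexer such as Solr, which maintains a search index and can return results quickly.
This is a good place to start: http://wiki.apache.org/solr/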
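
To illustrate why an index helps, here is a minimal sketch of the idea in Python. It is not Solr's API or how Solr is implemented, only a toy inverted index. It assumes the text has already been extracted from each PDF, and the file names and contents in `documents` are made up for the example. Note that it matches whole words, whereas the `InStr` call in the question matches any substring.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of file names that contain it.

    `documents` is a dict of {file_name: extracted_text}. Extracting the
    text from the PDFs is assumed to have happened already.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted file names that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical sample data standing in for text extracted from PDFs.
documents = {
    "invoice-001.pdf": "Invoice for pump repair",
    "manual.pdf": "Pump maintenance manual",
    "notes.pdf": "Meeting notes",
}

# Built once (and updated when files change), not on every request.
index = build_index(documents)

print(search(index, "pump"))         # ['invoice-001.pdf', 'manual.pdf']
print(search(index, "pump manual"))  # ['manual.pdf']
```

The expensive step, reading every file, happens once when the index is built. Each search is then a handful of dictionary lookups, no matter how many files are in the folder. An indexer like Solr does the same kind of work at scale and keeps the index on disk.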