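I'm trying to create a script that converts a PDF to plain text and then copies the plain text into Word. (We are reformatting corrupt documents from scratch.) I have a script that works perfectly except for one thing: when pasting into Word, it doesn't paste the whole file. With longer files, I only get part of the text.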
'string to hold file path
Dim strDMM
strDMM = "[path]"
'make this directory if it doesn't exist
On Error Resume Next
MkDir strDMM
On Error GoTo 0
'get the file name to process
Dim TheFile
TheFile = InputBox("What is the file name?" & chr(13) & chr(13) & "(Example: [name].pdf)", "Name of File")
'declare some acrobat variables
Dim AcroXApp
Dim AcroXAVDoc
Dim AcroXPDDoc
'open acrobat
Set AcroXApp = CreateObject("AcroExch.App")
AcroXApp.Hide
'open the document we want
Set AcroXAVDoc = CreateObject("AcroExch.AVDoc")
AcroXAVDoc.Open "[path to desktop]" & TheFile, "Acrobat" 'users are instructed to save to the Desktop for ease of access here
'make sure the acrobat window is active
AcroXAVDoc.BringToFront
'I don't know what this does. I copied it from code online.
Set AcroXPDDoc = AcroXAVDoc.GetPDDoc
'activate JavaScript commands w/Acrobat
Dim jsObj
Set jsObj = AcroXPDDoc.GetJSObject
'save the file as plain text
jsObj.SaveAs strDMM & "pdf-plain-text.txt", "com.adobe.acrobat.plain-text"
'close the file and exit acrobat
AcroXAVDoc.Close False
AcroXApp.Hide
AcroXApp.Exit
'declare constants for manipulating the text files
Const ForReading = 1
Const ForWriting = 2
'Create a File System Object
Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")
'read file and get text
dim objFile
set objFile=objFSO.OpenTextFile( strDMM & TheFile, ForReading)
Dim strText
strText=objFile.ReadAll
'Create a Word Object
Dim objWord
set objWord = CreateObject("Word.Application")
'make Word visible
With objWord
.Visible = True
End With
'Add method used to create a blank document
Dim objDoc
Set objDoc=objWord.Documents.Add()
'create a shorter variable to pass commands to Word
Dim objSelection
set objSelection=objWord.Selection
'type the read text into Word; this is the part that's failing
objSelection.TypeText strText
objFile.Close
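I've tried multiple files with the same result. Interestingly, it pastes the same material from file A every time, but it pastes a different amount of material when copying from file B. In other words, if A gives me 8 pages of 60 on the first run, I get those same 8 pages every time. File B might give me 14 pages of 60, and then it gives me the same 14 pages every time. This only changes if I delete material from the .txt file. If I delete a few paragraphs from A and then run the script, I might get 12 pages, and then I get the same 12 every time. But there is no pattern (that I can discern) that predicts where it gets cut off.
I can't find any EOF characters, and when I read from Notepad and write to Notepad, the whole thing is copied perfectly. The problem is somewhere in the transfer to Word.
Is there something I'm missing? Is there a limit on the size of the string that Word can write with TypeText? (I'd think that if that were the case, I wouldn't get different lengths from different files, right? If that were the limit, shouldn't they all stop at n characters?)
I've read about other libraries that let VBS work with the clipboard, but I'm a total noob and don't know whether that would be a more elegant solution, or how to make it work. I'm also not sure I have the necessary permissions on my work computer to install those libraries.
Any help is appreciated!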
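Answer 0 (score: 4)
There's no need to read the file into a string and type it into Word; you can insert the text file directly from disk: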
Dim objWord
'Dim objDoc
Set objWord = CreateObject("Word.Application")
'make Word visible
With objWord
.Visible = True
'Add method used to create a blank document
.Documents.Add
.Selection.InsertFile FileNameAndPath
End With
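The snippet above leaves `FileNameAndPath` undefined. A fuller sketch of the same idea might look like the following. The path in `strTextPath` is a hypothetical placeholder (not from the original post); it should point at the .txt file that the Acrobat step saved. Depending on the file's encoding, Word may show a conversion dialog when inserting a text file.

```vbscript
' Minimal sketch: insert a plain-text file into a new Word document.
' strTextPath is a hypothetical placeholder, not a path from the question.
Option Explicit

Dim strTextPath
strTextPath = "C:\Temp\pdf-plain-text.txt" ' assumed location, adjust as needed

' Check that the text file exists before starting Word
Dim objFSO
Set objFSO = CreateObject("Scripting.FileSystemObject")
If Not objFSO.FileExists(strTextPath) Then
    WScript.Echo "File not found: " & strTextPath
    WScript.Quit 1
End If

' Start Word, create a blank document, and insert the file at the cursor
Dim objWord
Set objWord = CreateObject("Word.Application")
objWord.Visible = True
objWord.Documents.Add
objWord.Selection.InsertFile strTextPath
```

Because Word reads the file itself, the text never passes through a VBScript string or through `TypeText`.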
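Answer 1 (score: 1)
The basic problem, as you mention, is that the String data type is limited to 65,400 characters. When the file length is unknown, it's better to read one line at a time and write it to Word. There is a good discussion of a similar problem here. The following code should help you achieve your goal: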
'read file and get text
dim objFile
set objFile=objFSO.OpenTextFile( strDMM & TheFile, ForReading)
'Don't do this!
'Dim strText
'strText=objFile.ReadAll
'Create a Word Object
Dim objWord
set objWord = CreateObject("Word.Application")
'make Word visible
With objWord
.Visible = True
End With
'Add method used to create a blank document
Dim objDoc
Set objDoc=objWord.Documents.Add()
'create a shorter variable to pass commands to Word
Dim objSelection
set objSelection=objWord.Selection
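'Read one line at a time from the text file and
'type that line into Word until the end of the file is reached
Dim strLine
Do Until objFile.AtEndOfStream
strLine = objFile.ReadLine
'ReadLine strips the line break, so add a paragraph mark after each line
If Len(strLine) > 0 Then objSelection.TypeText strLine
objSelection.TypeParagraph
Loop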
objFile.Close
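To check whether anything is still being cut off, one option is to count the lines as they are copied and compare that with what ends up in Word. The following is only a sketch under assumptions: `strTextPath` is a hypothetical placeholder path and `lngLines` is a counter introduced for illustration. The two numbers should be close but need not match exactly, since a new Word document already contains one empty paragraph and Word's AutoFormat settings can alter typed text.

```vbscript
' Sketch: copy a text file into Word line by line and report how many
' lines were read, so that truncation is easier to spot.
' strTextPath is a hypothetical placeholder, not a path from the question.
Option Explicit

Const ForReading = 1

Dim strTextPath
strTextPath = "C:\Temp\pdf-plain-text.txt" ' assumed location, adjust as needed

Dim objFSO, objFile
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(strTextPath, ForReading)

Dim objWord, objDoc
Set objWord = CreateObject("Word.Application")
objWord.Visible = True
Set objDoc = objWord.Documents.Add()

Dim strLine, lngLines
lngLines = 0
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    ' Skip TypeText for blank lines; still end each line with a paragraph mark
    If Len(strLine) > 0 Then objWord.Selection.TypeText strLine
    objWord.Selection.TypeParagraph
    lngLines = lngLines + 1
Loop
objFile.Close

' Report both counts for a rough comparison
WScript.Echo "Lines read from file: " & lngLines & vbCrLf & _
             "Paragraphs in Word: " & objDoc.Paragraphs.Count
```

If the paragraph count is far below the line count, the text is still being lost on the way into Word rather than while reading the file.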
Hope that helps!