我有大量的pdf文件,我希望将文件中的所有数据复制到电子表格中的列。
以下是我的代码。它只是打开pdf,使用control-a,然后使用control-c进行复制,然后激活工作簿,找到一个打开的列并使用control-v Sendkey粘贴数据。它工作正常,但它只粘贴最后一个文件中的lastdata(我有一个带有路径名的范围,它打开并复制所有数据,但实际上只粘贴最后一个)。
Sub StartAdobe1()
Dim AdobeApp As String
Dim AdobeFile As String
Dim StartAdobe
Dim fname As Variant
Dim iRow As Integer
Dim Filename As String
For Each fname In Range("path")
AdobeApp = "C:\Program Files (x86)\Adobe\Reader 10.0\Reader\AcroRd32.exe"
StartAdobe = Shell("" & AdobeApp & " " & fname & "", 1)
Application.Wait Now + TimeValue("00:00:01")
SendKeys "^a", True
Application.Wait Now + TimeValue("00:00:01")
SendKeys "^c"
Application.Wait Now + TimeValue("00:00:01")
SendKeys ("%{F4}")
Windows("transfer (Autosaved).xlsm").Activate
Worksheets("new").Activate
ActiveSheet.Range("A1").Select
Selection.End(xlToRight).Offset(0, 1).Select
SendKeys "^v"
Application.Wait Now + TimeValue("00:00:2")
Next fname
答案 0 :(得分:2)
Jeanno是对的,如果你有Acrobat,那么使用它的API库直接使用该文件比解决方法要好得多。我每天都使用它来将pdf文件转换为数据库条目。
您的代码存在一些问题,但我怀疑最大的问题是使用SendKeys "^v"
粘贴到Excel中。您最好选择所需的单元格,然后使用Selection.Paste
。或者甚至更好,将剪贴板的内容传输到变量,然后在写入电子表格之前根据需要在后端解析 - 但这增加了一堆复杂性,在这种情况下对你没有多大帮助。 / p>
要使用以下代码,请务必在工具>参考文献下选择“Acrobat x.x类型库”。
Sub StartAdobe1()
Dim fName As Variant
Dim wbTransfer As Excel.Workbook
Dim wsNew As Excel.Worksheet
Dim dOpenCol As Double
Dim oPDFApp As AcroApp
Dim oAVDoc As AcroAVDoc
Dim oPDDoc As AcroPDDoc
'Define your spreadsheet
Set wbTransfer = Workbooks("transfer (Autosaved).xlsm")
Set wsNew = wbTransfer.Sheets("new")
'Find first open column
dOpenCol = ws.Cells(1, columns.count).End(xlToleft).Column + 1
'Instantiate Acrobat Objects
Set oPDFApp = CreateObject("AcroExch.App")
Set oAVDoc = CreateObject("AcroExch.AVDoc")
Set oPDDoc = CreateObject("AcroExch.PDDoc")
For Each fName In Range("path")
'Open the PDF file. The AcroAVDoc.Open function returns a true/false
'to tell you if it worked
If oAVDoc.Open(fName.Text, "") = True Then
Set oPDDoc = oAVDoc.GetPDDoc
Else
Debug.Assert False
End If
'Copy all using Acrobat menu
oPDFApp.MenuItemExecute ("SelectAll")
oPDFApp.MenuItemExecute ("Copy")
'Paste into open column
wbTransfer.Activate
wsNew.Cells(1, dOpenCol).Select
ActiveSheet.Paste
'Select next open column
dOpenCol = dOpenCol + 1
oAVDoc.Close (1) '(1)=Do not save changes
oPDDoc.Close
Next
'Clean up
Set wbTransfer = Nothing
Set wsNew = Nothing
Set oPDFApp = Nothing
Set oAVDoc = Nothing
Set oPDDoc = Nothing
End Sub
注意:
1 - 还有一个菜单项oPDFApp.MenuItemExecute ("CopyFileToClipboard")
应该选择全部并一步复制,但我遇到了问题所以我坚持上面的两步法。
2-pdf文件由两个对象oAVDoc
和oPDDoc
组成。文件的不同方面由每个控制。在这种情况下,您可能只需要oAVDoc
。尝试评论处理oPDDoc
的行,看看它是否有效。
答案 1 :(得分:0)
我无法让你的代码工作,但我的猜测是它复制了所有数据,但每次都通过循环覆盖它。要解决此问题,请尝试:
ActiveSheet.Cells(1, ActiveSheet.Cells(1, Columns.Count).End(xlToLeft).Column + 1).Select
而不是开始activesheet.range(“A1”)的两行。选择和Selection.End ....
答案 2 :(得分:0)
尝试此代码可能会有效:
Sub Shell_Copy_Paste()
Dim o As Variant
Dim wkSheet As Worksheet
Set wkSheet = ActiveSheet
o = Shell("C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe C:\Users\saurabh.ad.sharma\Desktop\red.pdf", vbNormalFocus)
Application.Wait (Now + TimeSerial(0, 0, 2)) 'Wait for Acrobat to load
SendKeys "^a" 'Select All
SendKeys "^c" 'Copy
SendKeys "%{F4}" 'Close shell application
wkSheet.Range("B5").Select
SendKeys "^v" 'Paste
End Sub
答案 3 :(得分:0)
以下代码将复制PDF格式的数据将它按字样粘贴,然后将文字复制,然后将其粘贴到EXCEL上。
现在为什么我要将数据从pdf复制到word&然后从word复制并将其粘贴到excel,因为我想要将pdf中的数据精确格式化为我的Excel表格,如果我直接从pdf复制到excel,它会将pdf中的整个数据粘贴到单个单元格中即使我是有两列或多行,它会将我的所有数据粘贴到一个列中,单个单元格中也是如此,但如果我从word复制到excel,它将保留其原始格式,并且两列将仅在excel中粘贴为两列。 / p>
Private Sub CommandButton3_Click() '(load pdf)
Dim o As Variant
Set appWord = CreateObject("Word.Application")
o = Shell("C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe C:\Users\saurabh.ad.sharma\Desktop\Book1.pdf", vbNormalFocus) 'loading adobe reader & pdf file from their location
Application.Wait (Now + TimeSerial(0, 0, 2))
SendKeys ("^a")
SendKeys ("^c")
SendKeys "%{F4}"
Application.Wait Now + TimeValue("00:00:01")
Set appWord = CreateObject("Word.Application")
appWord.Visible = True
appWord.Documents.Add.Content.Paste
With appWord
.ActiveDocument.SaveAs Filename:=ThisWorkbook.Path & "\pdf" & ".docx", FileFormat:=wdocument 'saving word file in docx format
.ActiveWindow.Close
.Quit
End With
MsgBox " pdf is loaded "
MsgBox " Paste to EXCEL "
Set appWord = CreateObject("Word.Application")
appWord.Visible = True
appWord.Documents.Open "C:\Users\saurabh.ad.sharma\Desktop\pdf.docx" 'opening word document
appWord.Selection.WholeStory
appWord.Selection.Copy
Set wkSheet = ActiveSheet
wkSheet.Range("A1").Select
wkSheet.Paste 'pasting to the excel file
End Sub
答案 4 :(得分:0)
这是我上面代码的修改版本,它不会保存任何文件,它会将数据保存在剪贴板中并快速执行..
Private Sub CommandButton3_Click() '(load pdf)
Dim o As Variant
Set appWord = CreateObject("Word.Application")
o = Shell("C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe C:\Users\saurabh.ad.sharma\Desktop\Book1.pdf2", vbNormalFocus)
Application.Wait (Now + TimeSerial(0, 0, 2))
SendKeys ("^a")
SendKeys ("^c")
SendKeys "%{F4}"
Application.Wait Now + TimeValue("00:00:01")
Set appWord = CreateObject("Word.Application")
appWord.Visible = False
appWord.Documents.Add.Content.Paste
With appWord
.Selection.WholeStory
.Selection.Copy
.ActiveDocument.Close SaveChanges:=wdDoNotSaveChanges
.Quit
End With
MsgBox " pdf is loaded "
MsgBox " Paste to EXCEL "
Set wkSheet = ActiveSheet
wkSheet.Range("A1").Select
wkSheet.Paste
End Sub
答案 5 :(得分:0)
我有类似的问题。如前所述,最好的解决方案是使用Adobe API。在我的情况下,这是不可能的,因为宏适用于100多个PC上没有安装Adobe Pro的用户。
使用sendkeys的问题是您必须等到PDF文件被加载并且加载时间容易改变。小文件可能需要1秒钟,从共享驱动器打开大文件可能需要8秒钟。
请在下面找到我的解决方案,该解决方案经过优化,可与任何大小的文件一起使用。很稳定我发现只有在Adobe显示打开消息时它才会失败(例如“更新可用”)。
Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As LongPtr) 'Initialize sleep function via Windows API
Public Sub CopyToSheet(destinationSheet As Worksheet, pathToPdf as String)
'Copy data from PDF to worksheet
'Initialize timer
Dim StartTime As Double
StartTime = Timer
'Clear clipboard
Dim myData As DataObject
Set myData = New DataObject
myData.SetText text:=Empty
myData.PutInClipboard
Set myData = Nothing
'Build file paths
Dim pathToAdobe As String
pathToAdobe = """C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe"""
pathToPdf = """" & pathToPdf & """"
'Open PDF and wait untill it is open. If file is already opened it will be just activated
Dim pdfId As Long
pdfId = Shell(pathToAdobe & " " & pathToPdf, vbMaximizedFocus)
Do
Sleep (500)
If Round(Timer - StartTime, 2) > 9 Then MsgBox "Failed to open PDF": Exit Sub 'Safety check
Loop Until Me.IsPdfOpen(pathToPdf)
'Copy and wait until copying is completed
SendKeys "^a"
SendKeys "^c"
Do
Sleep (500)
If Round(Timer - StartTime, 2) > 18 Then MsgBox "Failed to copy data to clipboard": Exit Sub 'Safety check
Loop Until Me.GetClipboardStatus = "ClipboardHasData"
'Paste data into worksheet
destinationSheet.Activate
destinationSheet.Range("A1").Select
destinationSheet.Paste
'Close pdf
Call Shell("TaskKill /F /PID " & CStr(pdfId), vbHide)
'Clear clipboard
Set myData = New DataObject
myData.SetText text:=Empty
myData.PutInClipboard
Set myData = Nothing
End Sub
Function IsPdfOpen(pathToPdf) As Boolean
'Check if PDF is already opened
'Build window name (window name is name of the application on Windows task bar)
Dim windowName As String
windowName = pathToPdf
windowName = Mid(windowName, InStrRev(windowName, "\") + 1, Len(windowName) - InStrRev(windowName, "\") + 1)
windowName = windowName + " - Adobe Acrobat Reader DC"
'Try to activate application to check if is opened
On Error Resume Next
AppActivate windowName, True
Select Case Err.Number
Case 5: IsPdfOpen = False
Case 0: IsPdfOpen = True
Case Else: Debug.Assert False
End Select
On Error GoTo 0
End Function
Function GetClipboardStatus() As String
'Check if copying data to clipboard is completed
Dim tempString As String
Dim myData As DataObject
'Try to put data from clipboard to string to check if operations on clipboard are completed
On Error Resume Next
Set myData = New DataObject
myData.GetFromClipboard
tempString = myData.GetText(1)
If Err.Number = 0 Then
If tempString = "" Then
GetClipboardStatus = "ClipboardEmpty"
Else
GetClipboardStatus = "ClipboardHasData"
End If
Else
GetClipboardStatus = "ClipboardBusy"
End If
On Error GoTo 0
Set myData = Nothing
End Function