Question

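How can I export only the table content of a PDF to an Excel file programmatically in C#? I am currently using the PDFNet SDK to extract all content from the PDF, but I cannot read the tables as table structures.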

Answer 1

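I know this doesn't use an SDK, but I use a standalone product. It reads the contents of a PDF into a spreadsheet (with many export options).

The product is OmniPage by Nuance: http://australia.nuance.com/for-business/by-product/omnipage/index.htm

There is also an SDK with a free evaluation.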

Answer 2

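I tried the solution above but could not get it to work. There are many free SDKs or DLLs available, such as pdfnet, pdfclown, itextsharp, pdfbox, and pdflib. In the end I tried the PDFNet SDK again, and now I am able to do this, provided the input PDF is a tagged PDF.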

Answer 3

Using the ByteScout PDF Extractor SDK, we can extract the whole page as shown below:

        // Requires a reference to the SDK and: using Bytescout.PDFExtractor;
        CSVExtractor extractor = new CSVExtractor();
        extractor.RegistrationName = "demo";
        extractor.RegistrationKey = "demo";

        TableDetector tdetector = new TableDetector();
        tdetector.RegistrationName = "demo";
        tdetector.RegistrationKey = "demo";

        // Load the same document into both the extractor and the table detector
        extractor.LoadDocumentFromFile("C:\\sample.pdf");
        tdetector.LoadDocumentFromFile("C:\\sample.pdf");

        int pageCount = tdetector.GetPageCount();

        // Page numbering is kept as posted (1..pageCount);
        // adjust the range if your SDK version uses zero-based page indexes
        for (int i = 1; i <= pageCount; i++)
        {
            int j = 1; // counter used in the output file name

            do
            {
                // Use the whole page rectangle as the extraction area
                extractor.SetExtractionArea(
                    tdetector.GetPageRect_Left(i),
                    tdetector.GetPageRect_Top(i),
                    tdetector.GetPageRect_Width(i),
                    tdetector.GetPageRect_Height(i)
                );

                // and finally save the table into a CSV file
                extractor.SavePageCSVToFile(i, "C:\\page-" + i + "-table-" + j + ".csv");
                j++;
            } while (tdetector.FindNextTable()); // search for the next table
        }
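
The loop above writes one CSV file per page and table, which can leave many small files. As a follow-up, here is a minimal sketch that merges them into a single CSV that Excel can open. It assumes only the standard .NET file APIs; `CsvMerger`, `MergeLines`, and `MergeFiles` are hypothetical names for illustration and are not part of any PDF SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper (not part of any PDF SDK): merges per-table CSV files into one.
public static class CsvMerger
{
    // Concatenates the lines of each table in the given order,
    // inserting one empty line between tables so they stay visually separated.
    public static List<string> MergeLines(IEnumerable<IEnumerable<string>> tables)
    {
        var merged = new List<string>();
        foreach (var table in tables)
        {
            if (merged.Count > 0)
                merged.Add(string.Empty); // separator between tables
            merged.AddRange(table);
        }
        return merged;
    }

    // Reads every file in the folder that matches the pattern and writes the merged result.
    // Files are ordered by name as plain strings, so "page-10" sorts before "page-2".
    public static void MergeFiles(string folder, string pattern, string outputPath)
    {
        var files = Directory.GetFiles(folder, pattern)
                             .OrderBy(f => f, StringComparer.OrdinalIgnoreCase);
        var merged = MergeLines(files.Select(f => File.ReadAllLines(f)));
        File.WriteAllLines(outputPath, merged);
    }
}
```

For the file names used above, a call might look like `CsvMerger.MergeFiles(@"C:\", "page-*-table-*.csv", @"C:\all-tables.csv");`. The output name is chosen so that it does not match the search pattern, which keeps a second run from merging the previous result into itself.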

This is an old post, but I hope this helps someone.

Answer 4

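The answer above (John's) works, and it is really useful.

However, I used the ByteScout PDF Extractor SDK tool itself instead of writing code.

By the way, the tool generates a large number of worksheets in a single Excel file.

You can use the following macro in Excel to combine them into one sheet.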

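Sub ConvertAsOne()
    ' Appends the used range of every other sheet to the active sheet.
    Dim j As Long, X As Long
    Application.ScreenUpdating = False
    For j = 1 To Sheets.Count
        If Sheets(j).Name <> ActiveSheet.Name Then
            ' First row below the last used cell in column A of the active sheet
            X = Cells(Rows.Count, 1).End(xlUp).Row + 1
            Sheets(j).UsedRange.Copy Cells(X, 1)
        End If
    Next
    Range("B1").Select
    Application.ScreenUpdating = True
    MsgBox "Succeeded!", vbInformation, "Note"
End Sub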
