Convert text from PDF to Excel

Time: 2019-07-15 06:42:47

Tags: c# excel pdf itext

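I am using iTextSharp and want to copy text from a PDF and paste it into Excel. The text arrives, but its font is not copied along with it. How can I copy the text while preserving its formatting?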

    // Requires: using ClosedXML.Excel;
    //           using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
    private static void ExportPDF(string fileName) {
        string excelName = @"C:\Users\test2.xlsx";
        // Only text inside this rectangle (PDF user-space units) is extracted.
        var rectangle = new iTextSharp.text.Rectangle(150f, 50f, 794f, 690f);
        RenderFilter renderFilter = new RegionTextRenderFilter(rectangle);
        // Dispose the workbook and the reader even if extraction throws.
        using (var workbook = new XLWorkbook(excelName))
        using (var pdfReader = new PdfReader(fileName)) {
            var ws = workbook.Worksheet(1);
            var step = 0;
            for (var page = 451; page <= 466; page++) {
                // Use a fresh strategy per page so text from earlier pages is not carried over.
                ITextExtractionStrategy strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), renderFilter);
                // The result is an ordinary .NET string: characters only, no font information.
                // It needs no re-encoding before it is written to the sheet.
                string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
                step++;
                ws.Cell(step, 1).Value = currentText; // one page per row in column A
            }
            workbook.Save();
        }
    }
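
One possible direction, offered only as a rough sketch and not as a confirmed solution: the text returned by `PdfTextExtractor.GetTextFromPage` carries no font data, so the font has to be captured while the page is being parsed and then applied to the Excel cell separately. The sketch below assumes iTextSharp 5.x, where `TextRenderInfo.GetFont().PostscriptFontName` exposes the font name of each text chunk, and ClosedXML's `cell.Style.Font` properties. `FontRun`, `FontAwareStrategy`, and `FontMapper` are hypothetical names introduced for illustration. Deriving a family name and bold/italic flags from the PostScript name is a heuristic and may not match a font installed on the machine.

```csharp
using System.Collections.Generic;
using ClosedXML.Excel;
using iTextSharp.text.pdf.parser;

// Hypothetical helper type: one extracted text chunk plus the font it was drawn with.
public class FontRun {
    public string Text;
    public string PostScriptName;
}

// Hypothetical strategy: extracts text like LocationTextExtractionStrategy,
// and additionally records the font name of every chunk it is given.
public class FontAwareStrategy : LocationTextExtractionStrategy {
    public readonly List<FontRun> Runs = new List<FontRun>();

    public override void RenderText(TextRenderInfo renderInfo) {
        base.RenderText(renderInfo); // keep the normal text extraction working
        Runs.Add(new FontRun {
            Text = renderInfo.GetText(),
            PostScriptName = renderInfo.GetFont().PostscriptFontName
        });
    }
}

public static class FontMapper {
    // Heuristic (an assumption, not part of either library): drop a six-letter
    // subset prefix such as "ABCDEF+", then guess the style from the name.
    public static void Parse(string psName, out string family, out bool bold, out bool italic) {
        string name = psName ?? "";
        if (name.IndexOf('+') == 6) name = name.Substring(7);
        string lower = name.ToLowerInvariant();
        bold = lower.Contains("bold");
        italic = lower.Contains("italic") || lower.Contains("oblique");
        int cut = name.IndexOfAny(new[] { '-', ',' });
        family = cut > 0 ? name.Substring(0, cut) : name;
    }

    // Writes the chunk into the cell and applies the guessed font to the whole cell.
    public static void Apply(IXLCell cell, FontRun run) {
        Parse(run.PostScriptName, out string family, out bool bold, out bool italic);
        cell.Value = run.Text;
        cell.Style.Font.FontName = family;
        cell.Style.Font.Bold = bold;
        cell.Style.Font.Italic = italic;
    }
}
```

To try it, a `FontAwareStrategy` instance would take the place of `new LocationTextExtractionStrategy()` inside the `FilteredTextRenderListener`. After `GetTextFromPage` returns, its `Runs` list could be written out with `FontMapper.Apply`, one chunk per cell. Chunks follow the PDF's internal text operations, so a single visual line may be split into several runs.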

0 answers:

No answers yet