I want to cut out all pages of a PDF file that contain a special string (splittag). So far I have this code, but it just outputs every page of the source PDF. What is wrong with it? I iterate through the pages of the source PDF and check whether the current page contains the splittag, then create a new PDF named after the page number. It would be great if someone could help. Thank you!
iTextSharp.text.PdfReader reader = new iTextSharp.text.PdfReader(textBox3.Text);
string splittag = textBox2.Text;
StringBuilder text = new StringBuilder();
for (int i = 1; i <= reader.NumberOfPages; i++)
{
    if(PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy()).ToString().Contains(splittag)) ;
    {
        richTextBox1.Text = PdfTextExtractor.GetTextFromPage(reader, i, new SimpleTextExtractionStrategy());
        Document document = new Document();
        PdfCopy copy = new PdfCopy(document, new FileStream(textBox5.Text + "\\" + i + ".pdf", FileMode.Create));
        document.Open();
        copy.AddPage(copy.GetImportedPage(reader, i));
        document.Close();
    }
}
Answer 0 (score: 4)
I would use the following code:
public List<PdfPage> PdfDocument::copyPagesTo(int pageFrom,
int pageTo,
PdfDocument toDocument,
IPdfPageExtraCopier copier)
This produces the list of pages that need to be included. You can then use iText code to separate out the pages you want.
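The two steps described here (first build the list of pages, then separate them) can be sketched as follows. The signature quoted above appears to come from the newer iText 7 API, so this sketch swaps in the iTextSharp 5 calls already used in the question instead. The class name SplitTagPages, the method names FindPages and CopyPages, and the outputFolder parameter are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class, not part of iTextSharp.
static class SplitTagPages
{
    // Step 1: collect the 1-based numbers of pages whose text contains the tag.
    public static List<int> FindPages(PdfReader reader, string splittag)
    {
        var matches = new List<int>();
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            string text = PdfTextExtractor.GetTextFromPage(
                reader, i, new SimpleTextExtractionStrategy());
            if (text.Contains(splittag))
            {
                matches.Add(i);
            }
        }
        return matches;
    }

    // Step 2: write each listed page to its own file in outputFolder.
    public static void CopyPages(PdfReader reader, IEnumerable<int> pageNumbers, string outputFolder)
    {
        foreach (int page in pageNumbers)
        {
            string target = Path.Combine(outputFolder, page + ".pdf");
            Document document = new Document();
            PdfCopy copy = new PdfCopy(document, new FileStream(target, FileMode.Create));
            document.Open();
            copy.AddPage(copy.GetImportedPage(reader, page));
            document.Close();
        }
    }
}
```

Assuming a reader opened on the source file, a call such as SplitTagPages.CopyPages(reader, SplitTagPages.FindPages(reader, splittag), folder) would run both steps. Keeping the search separate from the copying makes it easy to inspect the list of matching pages before any file is written.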
Answer 1 (score: 0)
This is the code I am using now. It works fine and is simpler.
// The source PDF path comes from textBox2, the output folder from textBox3.
string filename = System.IO.Path.GetFileNameWithoutExtension(textBox2.Text);
using (PdfReader reader = new PdfReader(textBox2.Text))
{
    for (int pagenumber = 1; pagenumber <= reader.NumberOfPages; pagenumber++)
    {
        // No semicolon after the condition, so the block runs only for matching pages.
        if (PdfTextExtractor.GetTextFromPage(reader, pagenumber, new SimpleTextExtractionStrategy()).Contains("LoremIpsum"))
        {
            // Create the document only when the page is actually written out.
            string target = System.IO.Path.Combine(textBox3.Text, filename + pagenumber + ".pdf");
            Document document = new Document();
            PdfCopy copy = new PdfCopy(document, new FileStream(target, FileMode.Create));
            document.Open();
            copy.AddPage(copy.GetImportedPage(reader, pagenumber));
            document.Close();
        }
    }
}
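As far as I can tell, the reason the code in the question writes out every page is the semicolon directly after the if condition. That semicolon is an empty statement and becomes the entire body of the if, so the braces that follow are an ordinary block that runs on every iteration. The sketch below reproduces this without iTextSharp; the pages array is made-up sample data standing in for the extracted page text, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;

class EmptyStatementDemo
{
    static void Main()
    {
        // Made-up page texts standing in for the PdfTextExtractor output.
        string[] pages = { "intro", "splittag here", "appendix" };
        string splittag = "splittag";

        var withSemicolon = new List<int>();
        var withoutSemicolon = new List<int>();

        for (int i = 1; i <= pages.Length; i++)
        {
            // Bug: the ';' is an empty statement and is the whole body of the if.
            if (pages[i - 1].Contains(splittag)) ;
            {
                withSemicolon.Add(i); // plain block, runs on every iteration
            }

            if (pages[i - 1].Contains(splittag))
            {
                withoutSemicolon.Add(i); // runs only when the page matches
            }
        }

        Console.WriteLine(string.Join(",", withSemicolon));    // prints 1,2,3
        Console.WriteLine(string.Join(",", withoutSemicolon)); // prints 2
    }
}
```

The C# compiler normally flags the first if with warning CS0642 ("Possible mistaken empty statement"), so checking the build warnings is a quick way to catch this kind of slip.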