我正在使用iTextSharp从图像生成pdf-a文档。到目前为止,我还没有成功 编辑:我正在使用iTextSharp生成PDF
我尝试的只是制作一个pdf文件(1a或1b,无论适合什么),带有一些图像。这是我到目前为止提出的代码,但在尝试使用pdf-tools或validatepdfa验证时,我一直收到错误。
这是我从pdf-tools获得的错误(使用PDF / A-1b验证): 编辑:MarkInfo和Color Space尚未运行。其余的都没关系
Validating file "0.pdf" for conformance level pdfa-1a
The key MarkInfo is required but missing.
A device-specific color space (DeviceRGB) without an appropriate output intent is used.
The document does not conform to the requested standard.
The document contains device-specific color spaces.
The document doesn't provide appropriate logical structure information.
Done.
主要流程
var output = new MemoryStream();
using (var iccProfileStream = new FileStream("ToPdfConverter/ColorProfiles/sRGB_v4_ICC_preference_displayclass.icc", FileMode.Open))
{
var document = new Document(new Rectangle(PageSize.A4.Width, PageSize.A4.Height), 0f, 0f, 0f, 0f);
var pdfWriter = PdfWriter.GetInstance(document, output);
pdfWriter.PDFXConformance = PdfWriter.PDFA1A;
document.Open();
var pdfDictionary = new PdfDictionary(PdfName.OUTPUTINTENT);
pdfDictionary.Put(PdfName.OUTPUTCONDITION, new PdfString("sRGB IEC61966-2.1"));
pdfDictionary.Put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
pdfDictionary.Put(PdfName.S, PdfName.GTS_PDFA1);
var iccProfile = ICC_Profile.GetInstance(iccProfileStream);
var pdfIccBased = new PdfICCBased(iccProfile);
pdfIccBased.Remove(PdfName.ALTERNATE);
pdfDictionary.Put(PdfName.DESTOUTPUTPROFILE, pdfWriter.AddToBody(pdfIccBased).IndirectReference);
pdfWriter.ExtraCatalog.Put(PdfName.OUTPUTINTENT, new PdfArray(pdfDictionary));
var image = PrepareImage(imageBytes);
document.Open();
document.Add(image);
pdfWriter.CreateXmpMetadata();
pdfWriter.CloseStream = false;
document.Close();
}
return output.GetBuffer();
这是prepareImage()
它用于将图像压平为bmp,因此我不需要打扰alpha通道。
private Image PrepareImage(Stream stream)
{
Bitmap bmp = new Bitmap(System.Drawing.Image.FromStream(stream));
var file = new MemoryStream();
bmp.Save(file, ImageFormat.Bmp);
var image = Image.GetInstance(file.GetBuffer());
if (image.Height > PageSize.A4.Height || image.Width > PageSize.A4.Width)
{
image.ScaleToFit(PageSize.A4.Width, PageSize.A4.Height);
}
return image;
}
任何人都可以帮助我找到修复错误的方向吗?
特别是device-specific color spaces
编辑:更多解释:我想要实现的目标是将扫描图像转换为PDF / A以进行长期数据存储
编辑:添加了一些我用来测试的文件
PDF和Pictures.rar(3.9 MB)
https://mega.co.nz/#!n8pClYgL!NJOJqSO3EuVrqLVyh3c43yW-u_U35NqeB0svc6giaSQ
答案 0 :(得分:1)
好的,我在callas pdfToolbox中检查了你的一个文件,它说:“使用了设备颜色空间但没有PDF / A输出意图”。在编写输出意图到文档时,我认为你做错了什么。然后我使用相同的工具将该文档转换为PDF / A-1b,差别很明显。
可能还有其他错误需要修复,但这里的第一个错误是您在名为“OutputIntent”的PDF文件的目录字典中放了一个键。这是错误的:PDF规范的第75页声明密钥应命名为“OutputIntents”。
就像我说的那样,你的文件可能存在其他问题,但是密钥的名称错误导致PDF / A验证器找不到你试图放入文件的输出意图......
答案 1 :(得分:0)
首先,pdfx不是pdfa。
我不幸遇到图像问题的解决方案,但我有1和2。
此致
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Text;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;
using iTextSharp.tool.xml;
using System.Drawing;
using System.Drawing.Imaging;
namespace Tests
{
/*
* References:
* UTF-8 encoding http://stackoverflow.com/questions/4902033/itextsharp-5-polish-character
* PDFA http://www.codeproject.com/Questions/661704/Create-pdf-A-using-itextsharp
* Images http://stackoverflow.com/questions/15896581/make-a-pdf-conforming-pdf-a-with-only-images-using-itextsharp
*/
[TestClass]
public class UnitTest1
{
/*
* IMPORTANT: Restrictions with html usage of tags and attributes
* 1. Dont use * <head> <title>Sklep</title> </head>, because title is rendered to the page
*/
// Test cases
static string contents = "<html><body style=\"font-family:arial unicode ms;font-size: 8px;\"><p style=\"text-align: center;\"> Davčna številka dolžnika: 74605968<br /> </p><table> <tr> <td><b>\u0160t. sklepa: 88711501</b></td> <td style=\"text-align: right;\">Davčna številka dolžnika: 74605968</td> </tr> </table> <br/><img src=\"http://img.rtvslo.si/_static/images/rtvslo_mmc_logo.png\" /></body></html>";
//static string contents = "<html><body style=\"font-family:arial unicode ms;font-size: 8px;\"><p style=\"text-align: center;\"> Davčna številka dolžnika: 74605968<br /> </p><table> <tr> <td><b>\u0160t. sklepa: 88711501</b></td> <td style=\"text-align: right;\">Davčna številka dolžnika: 74605968</td> </tr> </table> <br/></body></html>";
//[TestMethod]
public void CreatePdfHtml()
{
createPDF(contents, true);
}
private void createPDF(string html, bool isPdfa)
{
TextReader reader = new StringReader(html);
Document document = new Document(PageSize.A4, 30, 30, 30, 30);
HTMLWorker worker = new HTMLWorker(document);
PdfWriter writer;
if (isPdfa)
{
//set conformity level
writer = PdfAWriter.GetInstance(document, new FileStream(@"c:\temp\testA.pdf", FileMode.Create), PdfAConformanceLevel.PDF_A_1B);
//set pdf version
writer.SetPdfVersion(PdfAWriter.PDF_VERSION_1_4);
// Create XMP metadata. It's a PDF/A requirement.
writer.CreateXmpMetadata();
}
else
{
writer = PdfWriter.GetInstance(document, new FileStream(@"c:\temp\test.pdf", FileMode.Create));
}
document.Open();
if (isPdfa) // document should be opend, or it will fail
{
// Set output intent for uncalibrated color space. PDF/A requirement.
ICC_Profile icc = ICC_Profile.GetInstance(Environment.GetEnvironmentVariable("SystemRoot") + @"\System32\spool\drivers\color\sRGB Color Space Profile.icm");
writer.SetOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
}
//register font used in html
FontFactory.Register(Environment.GetEnvironmentVariable("SystemRoot") + "\\Fonts\\ARIALUNI.TTF", "arial unicode ms");
//adding custom style attributes to html specific tasks. Can be used instead of css
//this one is a must fopr display of utf8 language specific characters (čćžđpš)
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
ST.LoadTagStyle("body", "encoding", "Identity-H");
worker.SetStyleSheet(ST);
worker.StartDocument();
worker.Parse(reader);
worker.EndDocument();
worker.Close();
document.Close();
}
}
}