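I have a problem with Polish characters in iTextSharp. I want to create a PDF from HTML. Everything works fine, except that the Polish characters are missing. I use the function below: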
private void createPDF(string html)
{
//MemoryStream msOutput = new MemoryStream();
TextReader reader = new StringReader(html);// step 1: creation of a document-object
Document document = new Document(PageSize.A4, 30, 30, 30, 30);
// step 2:
// we create a writer that listens to the document
// and directs a PDF stream to a file
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream("Test.pdf", FileMode.Create));
// step 3: we create a worker to parse the document
HTMLWorker worker = new HTMLWorker(document);
// step 4: we open document and start the worker on the document
document.Open();
worker.StartDocument();
// step 5: parse the html into the document
worker.Parse(reader);
// step 6: close the document and the worker
worker.EndDocument();
worker.Close();
document.Close();
}
and I call it like this:
createPDF("ĄąćęĘłŁŃńóÓŚśŹźŻż");
I tried setting:
BaseFont bf = BaseFont.CreateFont(BaseFont.TIMES_ROMAN,Encoding.UTF8.HeaderName,BaseFont.EMBEDDED);
writer.DirectContent.SetFontAndSize(bf, 16);
but it doesn't work.
Do you have any ideas?
Regards
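Answer 0 (score: 7)
I found the answer! =) (specifically for Polish). I feel obliged to post it in this old thread, because I'm sure I won't be the last one to land here.
I was quite disappointed that there were no good answers... Most of them suggest using ARIALUNI.TTF from the Windows FONTS folder, which makes the PDF file many times larger. The solution doesn't need to be that drastic...
Many others suggest examples that use cp1252 for the encoding, which fails with Arial and doesn't work with Helvetica for Polish text either.
I'm using iTextSharp 4.1.6... The trick is... cp1257! You can use it with BaseFont.COURIER, BaseFont.HELVETICA, and BaseFont.TIMES_ROMAN.
This works... and my PDF files are tiny (3 KB!):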
document.Open();
var bigFont = FontFactory.GetFont(BaseFont.COURIER, BaseFont.CP1257, 18, Font.BOLD);
var para = new Paragraph("Oryginał", bigFont);
document.Add(para);
document.Close();
I'll test later to make sure that, besides Windows 7, I can also open and read these files on Windows XP and Mac OS X.
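Why cp1257 can succeed where cp1252 fails can be checked without iTextSharp at all. The sketch below is not part of the answer above: it assumes .NET 5 or later (where legacy Windows code pages have to be registered through CodePagesEncodingProvider), and the names CodePageCheck and CanEncode are made up for illustration. It only asks whether a given code page has a slot for every character of a string.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class CodePageCheck
{
    // Returns true when every character of `text` can be represented
    // in the Windows code page identified by `codePage`.
    public static bool CanEncode(string text, int codePage)
    {
        // Assumption: .NET Core / .NET 5+, where legacy code pages are opt-in.
        // Registering the same provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Exception fallbacks disable the silent "best fit" / '?' substitution,
        // so an unmappable character raises instead of being replaced.
        Encoding strict = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }
}
```

With the test string from the question, CodePageCheck.CanEncode("ĄąćęĘłŁŃńóÓŚśŹźŻż", 1252) returns false, while the same call with 1250 or 1257 returns true. Whether a particular built-in PDF font then has glyphs for those slots is a separate matter that this check does not cover.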
Answer 1 (score: 6)
Just to put together what @Mark Storer said:
private void createPDF(string html)
{
//MemoryStream msOutput = new MemoryStream();
TextReader reader = new StringReader(html);// step 1: creation of a document-object
Document document = new Document(PageSize.A4, 30, 30, 30, 30);
// step 2:
// we create a writer that listens to the document
// and directs a PDF stream to a file
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream("Test.pdf", FileMode.Create));
// step 3: we create a worker to parse the document
HTMLWorker worker = new HTMLWorker(document);
// step 4: we open document and start the worker on the document
document.Open();
// step 4.1: register a unicode font and assign it an alias
FontFactory.Register("C:\\Windows\\Fonts\\ARIALUNI.TTF", "arial unicode ms");
// step 4.2: create a style sheet and set the encoding to Identity-H
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
ST.LoadTagStyle("body", "encoding", "Identity-H");
// step 4.3: assign the style sheet to the html parser
worker.Style = ST;
worker.StartDocument();
// step 5: parse the html into the document
worker.Parse(reader);
// step 6: close the document and the worker
worker.EndDocument();
worker.Close();
document.Close();
}
When you call it, wrap the text in a font tag that uses the name you registered above:
createPDF("<font face=\"arial unicode ms\">ĄąćęĘłŁŃńóÓŚśŹźŻż</font>");
Answer 2 (score: 2)
When you create the BaseFont, you need to specify that you want to use Unicode characters. This answer shows how.
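Answer 3 (score: 2)
When I went through various forums and Stack Overflow questions, I could not find an answer that offered a comprehensive solution to the special-characters problem. I try to provide one here, at the cost of a rather long answer. Hopefully this helps someone...
I used XMLWorker from SourceForge, because HtmlWorker has been deprecated. The problem with special characters persisted. I found two solutions that actually work; they can be used separately or combined.
The first relies on HTML and CSS only. Every tag involved needs to have the font-family style specified in order to be interpreted correctly by the ParseXHtml method (I don't know why style inheritance for nested tags doesn't work here, but it seems that it really doesn't, or at least not completely).
This solution lets you modify the resulting PDF through the HTML code alone, so some changes can be made without recompiling any code.
Simplified code (for an MVC application) would look like this: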
public FileStreamResult GetPdf()
{
const string CONTENT_TYPE = "application/pdf";
var fileName = "mySimple.pdf";
var html = GetViewPageHtmlCode();
// how to capture the view's HTML is described in other threads
var css = Server.MapPath("~/Content/Pdf.css");
// USED_ENCODING is not defined in this snippet; presumably an Encoding that matches the page charset
using (var capturedActionStream = new MemoryStream(USED_ENCODING.GetBytes(html)))
{
using (var cssFile = new FileStream(css, FileMode.Open))
{
var memoryStream = new MemoryStream();
//to create landscape, use PageSize.A4.Rotate() for pageSize
var document = new Document(PageSize.A4, 30, 30, 10, 10);
var writer = PdfWriter.GetInstance(document, memoryStream);
var worker = XMLWorkerHelper.GetInstance();
document.Open();
worker.ParseXHtml(writer, document, capturedActionStream, cssFile);
writer.CloseStream = false;
document.Close();
memoryStream.Position = 0;
//to enforce file download
HttpContext.Response.AddHeader(
"Content-Disposition",
String.Format("attachment; filename={0}", fileName));
var wrappedPdf = new FileStreamResult(memoryStream, CONTENT_TYPE);
return wrappedPdf;
}
}
}
body {
background-color: white;
font-size: .85em;
font-family: Arial;
margin: 0;
padding: 0;
color: black;
}
p, ul {
margin-bottom: 20px;
line-height: 1.6em;
}
div, span {
font-family: Arial;
}
h1, h2, h3, h4, h5, h6 {
font-size: 1.5em;
color: #000;
font-family: Arial;
}
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<title>@ViewBag.Title</title>
<link href="@Url.Content("~/Content/Pdf.css")" rel="stylesheet" type="text/css" />
</head>
<body>
<div class="page">
<div id="main">
@RenderBody()
</div>
</div>
</body>
</html>
@{
ViewBag.Title = "PDF page title";
}
<h1>@ViewBag.Title</h1>
<p>
ěščřžýáíéů ĚŠČŘŽÝÁÍÉŮ
</p>
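In the second solution, the font returned by the IFontProvider is replaced with one that contains (correct) representations of the special characters, and the BaseFont.IDENTITY_H encoding is used. The advantage of this approach is that only one font is used. That is also, in a way, its disadvantage.
In addition, this solution expects the font to be part of the project (*.ttf files placed in the Content/Fonts folder).
Alternatively, the fonts can be retrieved from the Windows fonts location, Environment.GetFolderPath(Environment.SpecialFolder.Fonts). This requires knowing (or firmly believing) which fonts are installed on the server, or having control over the server.
FontProvider (over FontFactory)
I took the liberty of extending Gregor S's solution. The result is a more elaborate FontFactory that can be used for a variety of HTML "templates" pushed through XMLWorker.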
public class CustomFontFactory : FontFactoryImp
{
public const Single DEFAULT_FONT_SIZE = 12;
public const Int32 DEFAULT_FONT_STYLE = 0;
public static readonly BaseColor DEFAULT_FONT_COLOR = BaseColor.BLACK;
public String DefaultFontPath { get; private set; }
public String DefaultFontEncoding { get; private set; }
public Boolean DefaultFontEmbedding { get; private set; }
public Single DefaultFontSize { get; private set; }
public Int32 DefaultFontStyle { get; private set; }
public BaseColor DefaultFontColor { get; private set; }
public Boolean ReplaceEncodingWithDefault { get; set; }
public Boolean ReplaceEmbeddingWithDefault { get; set; }
public Boolean ReplaceFontWithDefault { get; set; }
public Boolean ReplaceSizeWithDefault { get; set; }
public Boolean ReplaceStyleWithDefault { get; set; }
public Boolean ReplaceColorWithDefault { get; set; }
public BaseFont DefaultBaseFont { get; protected set; }
public CustomFontFactory(
String defaultFontFilePath,
String defaultFontEncoding = BaseFont.IDENTITY_H,
Boolean defaultFontEmbedding = BaseFont.EMBEDDED,
Single? defaultFontSize = null,
Int32? defaultFontStyle = null,
BaseColor defaultFontColor = null,
Boolean automaticalySetReplacementForNullables = true)
{
//set default font properties
DefaultFontPath = defaultFontFilePath;
DefaultFontEncoding = defaultFontEncoding;
DefaultFontEmbedding = defaultFontEmbedding;
DefaultFontColor = defaultFontColor == null
? DEFAULT_FONT_COLOR
: defaultFontColor;
DefaultFontSize = defaultFontSize.HasValue
? defaultFontSize.Value
: DEFAULT_FONT_SIZE;
DefaultFontStyle = defaultFontStyle.HasValue
? defaultFontStyle.Value
: DEFAULT_FONT_STYLE;
//set default replacement options
ReplaceFontWithDefault = false;
ReplaceEncodingWithDefault = true;
ReplaceEmbeddingWithDefault = false;
if (automaticalySetReplacementForNullables)
{
ReplaceSizeWithDefault = defaultFontSize.HasValue;
ReplaceStyleWithDefault = defaultFontStyle.HasValue;
ReplaceColorWithDefault = defaultFontColor != null;
}
//define default font
DefaultBaseFont = BaseFont.CreateFont(DefaultFontPath, DefaultFontEncoding, DefaultFontEmbedding);
//register system fonts
FontFactory.RegisterDirectories();
}
protected Font GetBaseFont(Single size, Int32 style, BaseColor color)
{
var baseFont = new Font(DefaultBaseFont, size, style, color);
return baseFont;
}
public override Font GetFont(String fontname, String encoding, Boolean embedded, Single size, Int32 style, BaseColor color, Boolean cached)
{
//eventually replace expected font properties
size = ReplaceSizeWithDefault
? DefaultFontSize
: size;
style = ReplaceStyleWithDefault
? DefaultFontStyle
: style;
encoding = ReplaceEncodingWithDefault
? DefaultFontEncoding
: encoding;
embedded = ReplaceEmbeddingWithDefault
? DefaultFontEmbedding
: embedded;
//get font
Font font = null;
if (ReplaceFontWithDefault)
{
font = GetBaseFont(
size,
style,
color);
}
else
{
font = FontFactory.GetFont(
fontname,
encoding,
embedded,
size,
style,
color,
cached);
if (font.BaseFont == null)
font = GetBaseFont(
size,
style,
color);
}
return font;
}
}
private const String DEFAULT_FONT_LOCATION = "~/Content/Fonts";
private const String DEFAULT_FONT_NAME = "arialn.ttf";
public FileStreamResult GetPdf()
{
const string CONTENT_TYPE = "application/pdf";
var fileName = "mySimple.pdf";
var html = GetViewPageHtmlCode();
// how to capture the view's HTML is described in other threads
var css = Server.MapPath("~/Content/Pdf.css");
// USED_ENCODING is not defined in this snippet; presumably an Encoding that matches the page charset
using (var capturedActionStream = new MemoryStream(USED_ENCODING.GetBytes(html)))
{
using (var cssFile = new FileStream(css, FileMode.Open))
{
var memoryStream = new MemoryStream();
var document = new Document(PageSize.A4, 30, 30, 10, 10);
//to create landscape, use PageSize.A4.Rotate() for pageSize
var writer = PdfWriter.GetInstance(document, memoryStream);
var worker = XMLWorkerHelper.GetInstance();
var defaultFontPath = Server
.MapPath(Path
.Combine(
DEFAULT_FONT_LOCATION,
DEFAULT_FONT_NAME));
var fontProvider = new CustomFontFactory(defaultFontPath);
document.Open();
worker.ParseXHtml(writer, document, capturedActionStream, cssFile, fontProvider);
writer.CloseStream = false;
document.Close();
memoryStream.Position = 0;
//to enforce file download
HttpContext.Response.AddHeader(
"Content-Disposition",
String.Format("attachment; filename={0}", fileName));
var wrappedPdf = new FileStreamResult(memoryStream, CONTENT_TYPE);
return wrappedPdf;
}
}
}
body {
background-color: white;
font-size: .85em;
font-family: "Trebuchet MS", Verdana, Helvetica, Sans-Serif;
margin: 0;
padding: 0;
color: black;
}
p, ul {
margin-bottom: 20px;
line-height: 1.6em;
}
h1, h2, h3, h4, h5, h6 {
font-size: 1.5em;
color: #000;
}
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<title>@ViewBag.Title</title>
<link href="@Url.Content("~/Content/Pdf.css")" rel="stylesheet" type="text/css" />
</head>
<body>
<div class="page">
<div id="main">
@RenderBody()
</div>
</div>
</body>
</html>
@{
ViewBag.Title = "PDF page title";
}
<h1>@ViewBag.Title</h1>
<p>
ěščřžýáíéů ĚŠČŘŽÝÁÍÉŮ
</p>
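Answer 4 (score: 1)
1) iText 5.0.6 was released today, with a major overhaul of the HTML->PDF conversion code. I suggest you try the new code.
2) I'm almost certain that setting the directContent like that has no effect on the PDF content generated by HTMLWorker. I'm 99% sure it resets the font before drawing any text.
3) Try wrapping your string in a <font face="AFontThatActuallyContainsThoseCharacters"> tag. I seriously doubt that the default font HTMLWorker picks is up to the job.
Nope. The default is Helvetica with WinAnsiEncoding. That is definitely not suitable for anything beyond typical English/German/French/Spanish.
You should be able to use HTMLWorker.setStyleSheet to set some friendlier defaults. You'll want to set "face" and "encoding" to something more Polish-friendly. I suggest "Identity-H" for the encoding, which gives access to every character in the chosen font, regardless of language. As for the font, Windows has shipped a program called "charmap.exe" since way back, which shows the characters a font provides in a given encoding (including Unicode). The "Arial" family looks good, as do several others.
The "new code" probably won't change any of the behavior you're seeing. It is a refactoring meant to make future changes (in the next release, as I understand it) easier.
My suggestion is to go with setStyleSheet():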
// step 3: we create a worker parse the document
HTMLWorker worker = new HTMLWorker(document);
StyleSheet sheet = new StyleSheet();
HashMap<String, String> styleMap = new HashMap<String, String>();
styleMap.put("face", "Arial"); // default font
styleMap.put("encoding", "Identity-H"); // default encoding
String tags[] = {"p", "div", ...};
for (String tag : tags) {
sheet.applyStyle( tag, styleMap );
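}
// hand the style sheet to the worker (the setStyleSheet() call suggested above)
worker.setStyleSheet(sheet);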
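You might be able to just call applyStyle("body", styleMap) and have it cascade to everything the body contains, but I'm not sure. I'm also not sure this will fix your one-line test, since no tags are involved. IIRC, we build a body tag when there isn't one, but I'm not at all certain.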