I'm using FreeTextBox.dll to collect user input and store it in the database as HTML. A sample of the user's input looks like this:
133 Peachtree St NE
Atlanta, GA 30303
404-652-7777Cindy Cooley
www.somecompany.com
Product Stewardship Mgr9/9/2011
Deidre's Company
123 Test St
Atlanta, GA 30303
Test test.
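I want HTMLWorker to preserve the whitespace the user entered, but it strips it out. Is there a way to keep the user's whitespace? Below is an example of how I create the PDF document.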
Public Shared Sub CreatePreviewPDF(ByVal vsHTML As String, ByVal vsFileName As String)
    Dim output As New MemoryStream()
    Dim oDocument As New Document(PageSize.LETTER)
    Dim writer As PdfWriter = PdfWriter.GetInstance(oDocument, output)
    Dim oFont As New Font(Font.FontFamily.TIMES_ROMAN, 8, Font.NORMAL, BaseColor.BLACK)
    Using output
        Using writer
            Using oDocument
                oDocument.Open()
                Using sr As New StringReader(vsHTML)
                    Using worker As New html.simpleparser.HTMLWorker(oDocument)
                        worker.StartDocument()
                        worker.SetInsidePRE(True)
                        worker.Parse(sr)
                        worker.EndDocument()
                        worker.Close()
                        oDocument.Close()
                    End Using
                End Using
                HttpContext.Current.Response.ContentType = "application/pdf"
                HttpContext.Current.Response.AddHeader("Content-Disposition", String.Format("attachment;filename={0}.pdf", vsFileName))
                HttpContext.Current.Response.BinaryWrite(output.ToArray())
                HttpContext.Current.Response.End()
            End Using
        End Using
        output.Close()
    End Using
End Sub
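Answer 0 (score: 1)
iText and iTextSharp have a small glitch here, but if you don't mind downloading the source and recompiling it, the fix is easy. You need to change two files. Every change I made is commented inline in the code. Line numbers are based on the 5.1.2.0 source, rev 240.
The first file is iTextSharp.text.html.HtmlUtilities.cs. Find the function EliminateWhiteSpace at line 249 and change it to: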
public static String EliminateWhiteSpace(String content) {
    // multiple spaces are reduced to one,
    // newlines are treated as spaces,
    // tabs, carriage returns are ignored.
    StringBuilder buf = new StringBuilder();
    int len = content.Length;
    char character;
    bool newline = false;
    bool space = false;//Detect whether we have written at least one space already
    for (int i = 0; i < len; i++) {
        switch (character = content[i]) {
            case ' ':
                if (!newline && !space) {//If we are not at a new line AND ALSO did not just append a space
                    buf.Append(character);
                    space = true; //flag that we just wrote a space
                }
                break;
            case '\n':
                if (i > 0) {
                    newline = true;
                    buf.Append(' ');
                }
                break;
            case '\r':
                break;
            case '\t':
                break;
            default:
                newline = false;
                space = false; //reset flag
                buf.Append(character);
                break;
        }
    }
    return buf.ToString();
}
The second change is in iTextSharp.text.xml.simpleparser.SimpleXMLParser.cs. In the function Go, which starts at line 185, change line 248 to:
if (html /*&& nowhite*/) {//removed the nowhite check from here because that should be handled by the HTML parser later, not the XML parser
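Answer 1 (score: 0)
I'd suggest using wkhtmltopdf instead of iText. wkhtmltopdf outputs the HTML exactly as rendered by WebKit (Google Chrome, Safari) rather than going through iText's own conversion. It is just a binary that you can call. That said, I would also inspect the HTML to make sure the paragraphs and/or line breaks from the user's input are actually there; they may be getting stripped before the conversion.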
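Calling the binary from .NET could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name, the helper names, and the exePath parameter are hypothetical, and it relies only on the basic "wkhtmltopdf <input> <output>" invocation, with the HTML written to a temporary file first.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper, not part of wkhtmltopdf or iTextSharp.
public static class WkHtmlToPdfSketch
{
    // Builds the argument string "<input> <output>", quoting both paths
    // so that paths containing spaces survive.
    public static string BuildArguments(string htmlPath, string pdfPath)
    {
        return "\"" + htmlPath + "\" \"" + pdfPath + "\"";
    }

    // Writes the HTML to a temp file, runs wkhtmltopdf on it, and returns the PDF bytes.
    // exePath is an assumption: the location of the wkhtmltopdf binary on your machine.
    public static byte[] Convert(string html, string exePath)
    {
        string htmlPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".html");
        string pdfPath = Path.ChangeExtension(htmlPath, ".pdf");
        File.WriteAllText(htmlPath, html);
        try
        {
            var psi = new ProcessStartInfo(exePath, BuildArguments(htmlPath, pdfPath))
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process process = Process.Start(psi))
            {
                process.WaitForExit();
                if (process.ExitCode != 0)
                {
                    throw new InvalidOperationException(
                        "wkhtmltopdf exited with code " + process.ExitCode);
                }
            }
            return File.ReadAllBytes(pdfPath);
        }
        finally
        {
            // Clean up the temp files whether or not the conversion succeeded.
            File.Delete(htmlPath);
            if (File.Exists(pdfPath)) File.Delete(pdfPath);
        }
    }
}
```

The returned byte array can be passed to Response.BinaryWrite in the same way the question's code passes output.ToArray().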
Answer 2 (score: 0)
Thanks, everyone, for the help. I was able to find a small workaround by doing the following:
vsHTML.Replace(" ", " ").Replace(Chr(9), " ").Replace(Chr(160), " ").Replace(vbCrLf, "<br />")
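The actual code does not display correctly here because the replacement strings were lost in rendering. The first Replace swaps each space for an HTML entity; Chr(9) (tab) and Chr(160) (the non-breaking space character) are likewise replaced with entity strings that did not survive, presumably built from &nbsp;.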
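A minimal C# sketch of the same replacement chain follows. Since the original replacement strings did not survive, the use of &nbsp; and the tabWidth parameter are assumptions, and the class and method names are hypothetical. Note that strings are immutable in .NET, so the result of Replace has to be assigned or returned; calling it on its own changes nothing.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the workaround; not part of iTextSharp.
public static class WhitespacePreserver
{
    // Assumption: the entity lost from the answer's code was "&nbsp;".
    private const string Nbsp = "&nbsp;";

    // Replaces spaces, tabs, non-breaking spaces, and CRLF line breaks with
    // HTML that survives whitespace collapsing. tabWidth is an assumed parameter:
    // the number of &nbsp; entities that stand in for one tab.
    public static string Preserve(string html, int tabWidth)
    {
        var tab = new StringBuilder();
        for (int i = 0; i < tabWidth; i++)
        {
            tab.Append(Nbsp);
        }

        // Same order as the answer: space, tab, Chr(160), then line breaks.
        // "<br />" is inserted last so the space inside it is not rewritten.
        return html
            .Replace(" ", Nbsp)
            .Replace("\t", tab.ToString())
            .Replace("\u00A0", Nbsp)
            .Replace("\r\n", "<br />");
    }
}
```

Like the original one-liner, this replaces every space, including spaces inside tags such as attribute separators, so it is only safe on text that does not contain markup with attributes.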