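I'm trying to convert a web page to PDF using VB.NET and iTextSharp. I've tried many different examples, and none of them seem to work (at least for me). This is the closest I've gotten, but no matter which web page I parse, XMLWorkerHelper throws the exception "Invalid nested tag head found, expected closing tag script".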
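Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.tool.xml

Dim webClient As New System.Net.WebClient
Dim result As String = webClient.DownloadString("http://google.com")
Dim doc As New Document(PageSize.A4)
Dim writer As PdfWriter = PdfWriter.GetInstance(doc, New System.IO.FileStream("c:\test.pdf", System.IO.FileMode.Create))
Dim sr As New System.IO.StringReader(result)
Try
    ' Throws: Invalid nested tag head found, expected closing tag script
    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, sr)
Catch ex As Exception
    ' The exception ends up here; the handler is intentionally left empty
End Try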
I'm using iTextSharp 5.4.2.0 and .NET Framework 4.
Thanks.
Answer 0 (score: 1)
Use
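var htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.OptionOutputAsXml = true;  // write the document out as well-formed XML
htmlDoc.LoadHtml(html);            // html: placeholder for the downloaded page source
htmlDoc.Save(outputPath);          // outputPath: placeholder for your directory/file; saves the converted HTML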
Then convert from the new HTML file.
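For reference, the two steps can be combined in VB.NET roughly as follows. This is a minimal sketch, not a verified fix: it assumes the HtmlAgilityPack and iTextSharp XML Worker assemblies are referenced, reuses the URL and output path from the question, and the module name HtmlToPdfSketch is made up for illustration. The cleaned markup is kept in memory rather than saved to a file, which is a variation on the answer above. Pages with scripts or heavy styling may still fail to parse.

```vbnet
Imports System.IO
Imports System.Net
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports iTextSharp.tool.xml

Module HtmlToPdfSketch ' hypothetical name, for illustration only
    Sub Main()
        ' Step 1: download the raw HTML (URL taken from the question)
        Dim html As String
        Using client As New WebClient()
            html = client.DownloadString("http://google.com")
        End Using

        ' Step 2: let HtmlAgilityPack rewrite it as well-formed XML
        Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
        htmlDoc.OptionOutputAsXml = True
        htmlDoc.LoadHtml(html)

        Dim xhtml As String
        Using sw As New StringWriter()
            htmlDoc.Save(sw) ' in-memory variant of saving to a file
            xhtml = sw.ToString()
        End Using

        ' Step 3: hand the cleaned markup to XMLWorkerHelper
        Using fs As New FileStream("c:\test.pdf", FileMode.Create)
            Dim doc As New Document(PageSize.A4)
            Dim writer As PdfWriter = PdfWriter.GetInstance(doc, fs)
            doc.Open() ' the document must be open before content is added
            Using sr As New StringReader(xhtml)
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, sr)
            End Using
            doc.Close()
        End Using
    End Sub
End Module
```

Note that the document is opened before parsing and closed afterwards, and that the exception is no longer swallowed, so any remaining parse error is visible.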