Arabic in a PDF using iTextSharp in C#

Time: 2015-12-30 11:05:16

Tags: c# pdf itext arabic arabic-support

I want to create a PDF file with Arabic text content in C#. I am using iTextSharp to create it. I followed the instructions at http://geekswithblogs.net/JaydPage/archive/2011/11/02/using-itextsharp-to-correctly-display-hebrew--arabic-text-right.aspx. I want to insert the following Arabic sentence into the PDF.

  

تم إبرام هذا العقد في هذا اليوم [●] م الموافق [●] من قبل وبين.

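Each [●] needs to be replaced by dynamic English words. I tried to implement this using ARIALUNI.TTF, as the linked tutorial suggests. The code is given below.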

public void WriteDocument()
{
    //Declare a itextSharp document 
    Document document = new Document(PageSize.A4);

    //Create our file stream and bind the writer to the document and the stream 
    PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(@"D:\Test.Pdf", FileMode.Create));

    //Open the document for writing 
    document.Open();

    //Add a new page 
    document.NewPage();

    //Reference a Unicode font to be sure that the symbols are present. 
    BaseFont bfArialUniCode = BaseFont.CreateFont(@"D:\ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
    //Create a font from the base font 
    Font font = new Font(bfArialUniCode, 12);

    //Use a table so that we can set the text direction 
    PdfPTable table = new PdfPTable(1);
    //Ensure that wrapping is on, otherwise Right to Left text will not display 
    table.DefaultCell.NoWrap = false;

    //Create a regex expression to detect hebrew or arabic code points 
    const string regex_match_arabic_hebrew = @"[\u0600-\u06FF,\u0590-\u05FF]+";
    if (Regex.IsMatch("م الموافق", regex_match_arabic_hebrew, RegexOptions.IgnoreCase))
    {
        table.RunDirection = PdfWriter.RUN_DIRECTION_RTL;
    }

    //Create a cell and add text to it 
    PdfPCell text = new PdfPCell(new Phrase(" : "+"من قبل وبين" + " 2007 " + "م الموافق" + " dsdsdsdsds " + "تم إبرام هذا العقد في هذا اليوم ", font));
    //Ensure that wrapping is on, otherwise Right to Left text will not display 
    text.NoWrap = false;

    //Add the cell to the table 
    table.AddCell(text);

    //Add the table to the document 
    document.Add(table);

    //Close the document 
    document.Close();

    //Launch the document if you have a file association set for PDF's 
    Process AcrobatReader = new Process();
    AcrobatReader.StartInfo.FileName = @"D:\Test.Pdf";
    AcrobatReader.Start();
}

On calling this function, I got a PDF with some Unicode text, as shown below.

  

اذهيفددعلااذهماربإمتdsdsdsdsdsقفاوملام2007نيبولبقنم   مويلا

It does not match our hard-coded Arabic sentence. Is this a font problem? Please help me, or suggest any other way to implement the same thing.
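
A side note on the regex in the code above: inside a character class the comma is a literal, so `[\u0600-\u06FF,\u0590-\u05FF]` also matches a plain ",". The following is a minimal sketch of the same check without the comma. The names RtlDetector and ContainsRtl are hypothetical and are not part of iTextSharp, and RegexOptions.IgnoreCase is dropped on the assumption that case does not matter for these blocks.

```csharp
using System.Text.RegularExpressions;

static class RtlDetector
{
    // Hebrew block (U+0590..U+05FF) and Arabic block (U+0600..U+06FF).
    // No comma inside the brackets: there it would be matched literally.
    private static readonly Regex RtlChars =
        new Regex(@"[\u0590-\u05FF\u0600-\u06FF]");

    // True if the text contains at least one Hebrew or Arabic code point.
    public static bool ContainsRtl(string text)
    {
        return RtlChars.IsMatch(text);
    }
}
```

With this helper, the run direction would be switched to RTL only when `RtlDetector.ContainsRtl(...)` returns true for the text that is actually going into the cell.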

3 answers:

Answer 0: (score: 7)

@csharpcoder has the right idea, but his execution is off. He never adds the cell to a table, and the table never ends up in the document.

void Go()
{
    Document doc = new Document(PageSize.LETTER);
    string yourPath = "foo/bar/baz.pdf";
    using (FileStream os = new FileStream(yourPath, FileMode.Create))
    {
        PdfWriter.GetInstance(doc, os); // you don't need the return value

        doc.Open();

        string fontLoc = @"c:\windows\fonts\arialuni.ttf"; // make sure to have the correct path to the font file
        BaseFont bf = BaseFont.CreateFont(fontLoc, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        Font f = new Font(bf, 12);

        PdfPTable table = new PdfPTable(1); // a table with 1 column
        Phrase text = new Phrase("العقد", f);
        PdfPCell cell = new PdfPCell(text);
        table.RunDirection = PdfWriter.RUN_DIRECTION_RTL; // can also be set on the cell
        table.AddCell(cell);
        doc.Add(table);
        doc.Close();
    }
}

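You will probably want to get rid of the cell borders and so on, but that information can be found elsewhere on SO or on the iText site. iText should be able to handle text that contains both RTL and LTR characters.

Edit

I think the source of the problem is actually how Arabic text is rendered in Visual Studio and Firefox (my browser), or else how the strings are concatenated. I am not very familiar with Arabic text editors, but the text seems to come out correctly if we do this:

[Screenshot: Arabic text in Visual Studio]

FYI, I had to take a screenshot because copying and pasting from VS into the browser (and vice versa) messes up the order of the parts of the text.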
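
If the concatenation order is the culprit, one option is to keep the sentence in logical (reading) order and substitute the dynamic values into it, leaving the visual reordering to the RTL run direction. The following is a minimal sketch, assuming the two [●] slots take the sample values from the question. ContractSentence, Template and Build are hypothetical names and are not part of iTextSharp.

```csharp
using System;

static class ContractSentence
{
    // The sentence in logical order, first word first.
    // {0} and {1} stand in for the two [●] placeholders.
    private const string Template =
        "تم إبرام هذا العقد في هذا اليوم {0} م الموافق {1} من قبل وبين :";

    // Substitutes the two dynamic values without reversing any segment.
    public static string Build(string first, string second)
    {
        return string.Format(Template, first, second);
    }
}
```

The string returned by `ContractSentence.Build("dsdsdsdsds", "2007")` could then be passed to the Phrase in place of the hand-reversed concatenation from the question.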

Answer 1: (score: 4)

Right-to-left writing and Arabic ligatures are only supported in ColumnText and PdfPTable!

Try the following code:

Document Doc = new Document(PageSize.LETTER);

//Create our file stream
using (FileStream fs = new FileStream(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.pdf"), FileMode.Create, FileAccess.Write, FileShare.Read))
{
    //Bind PDF writer to document and stream
    PdfWriter writer = PdfWriter.GetInstance(Doc, fs);

    //Open document for writing
    Doc.Open();

    //Add a page
    Doc.NewPage();

    //Full path to the font file (arabtype.TTF in the system Fonts folder)
    string ARIALUNI_TFF = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "arabtype.TTF");

    //Create a base font object making sure to specify IDENTITY-H
    BaseFont bf = BaseFont.CreateFont(ARIALUNI_TFF, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
    Font f = new Font(bf, 12);
    //Write some text, the last character is 0x0278 - LATIN SMALL LETTER PHI
    Doc.Add(new Phrase("This is a ميسو ɸ", f));

    //add Arabic text, for instance in a table
    PdfPCell cell = new PdfPCell();
    cell.AddElement(new Phrase("Hello\u0682", f));
    cell.RunDirection = PdfWriter.RUN_DIRECTION_RTL;
    //Close the PDF
    Doc.Close();
}

Answer 2: (score: 1)

I hope these notes help you solve the remaining problems:

  1. Use safe code to locate your font file:

    var tahomaFontFile =  Path.Combine(
        Environment.GetFolderPath(Environment.SpecialFolder.Fonts), 
        "Tahoma.ttf");
    
  2. Use the BaseFont.IDENTITY_H and BaseFont.EMBEDDED properties:

    var tahomaBaseFont = BaseFont.CreateFont(tahomaFontFile, 
         BaseFont.IDENTITY_H, 
         BaseFont.EMBEDDED);
    var tahomaFont = new Font(tahomaBaseFont, 8, Font.NORMAL);
    
  3. Use PdfWriter.RUN_DIRECTION_RTL for your cells and tables:

    var table = new PdfPTable(1) 
        { 
            RunDirection = PdfWriter.RUN_DIRECTION_RTL 
        };
    
    var phrase = new Phrase("تم إبرام هذا العقد في هذا اليوم [●] م الموافق [●] من قبل وبين .", 
         tahomaFont);
    var cell = new PdfPCell(phrase)
        {
            RunDirection = PdfWriter.RUN_DIRECTION_RTL,
            Border = 0,
        };