Convert RTF format to plain text

Time: 2018-04-28 04:29:54

Tags: java c# android wpf

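I receive RTF-formatted text from an API (WPF, C#). I want to convert it to plain text and display it in a TextView on Android. I have tried everything I could think of, without success.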

  

RTF text: 0x7B5C727466315C616E73695C616E7369637067313235325C64656666305C6465666C616E67313033337B5C666F6E7474626C7B5C66305C666E696C5C666368617273657430204D6963726F736F66742053616E732053657269663B7D7D5C766965776B696E64345C7563315C706172645C66305C667331382068656C6C6F5C70617220207D

On the API side, we can convert it with the code below, but I want to do the conversion on the Android side.

ASCIIEncoding ascii = new ASCIIEncoding();
Archonix.Controls.RichTextBox rtbnote;
byte[] note;
rtbnote = new Controls.RichTextBox();
note = (dr["AlertNote"]) as byte[];
rtbnote.Rtf = ascii.GetString(note);
string tempRtfText = rtbnote.Rtf;
string tempRegularText = rtbnote.Text;
objAlertList.AlertNote = tempRegularText;

2 answers:

Answer 0: (score: 2)

That is not RTF. It is a hex encoding of bytes.

The bytes are text (RTF text), so you must first decode the hex string:

import java.nio.charset.StandardCharsets;

// The class name is arbitrary; it only makes the snippet compile on its own.
public class HexToRtf {
    public static void main(String[] args) {
        String hex = "0x7B5C727466315C616E73695C616E7369637067313235325C6465" +
                     "6666305C6465666C616E67313033337B5C666F6E7474626C7B5C66" +
                     "305C666E696C5C666368617273657430204D6963726F736F667420" +
                     "53616E732053657269663B7D7D5C766965776B696E64345C756331" +
                     "5C706172645C66305C667331382068656C6C6F5C70617220207D";
        // Drop the leading "0x" before decoding.
        byte[] bytes = hexStringToByteArray(hex.substring(2));
        String text = new String(bytes, StandardCharsets.US_ASCII);
        System.out.println(text);
    }

    // Turns each pair of hex digits into one byte.
    // Assumes an even number of valid hex digits; no validation is done.
    public static byte[] hexStringToByteArray(String s) {
        int len = s.length();
        byte[] data = new byte[len / 2];
        for (int i = 0; i < len; i += 2) {
            data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                                 + Character.digit(s.charAt(i + 1), 16));
        }
        return data;
    }
}

Note: hexStringToByteArray comes from "Convert a string representation of a hex dump to a byte array using Java?"
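The helper above trusts its input: an odd number of digits or a non-hex character silently yields wrong bytes, because Character.digit returns -1 for invalid characters. If the API payload might be malformed, a stricter variant could look like the sketch below. The class name StrictHex and the method decode are made up for illustration, and the optional "0x" prefix handling is an assumption about what the API sends.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not from the original answer): hex decoding with input checks.
final class StrictHex {
    static byte[] decode(String hex) {
        String s = hex.trim();
        // Assumption: the payload may or may not carry a "0x" prefix.
        if (s.startsWith("0x") || s.startsWith("0X")) {
            s = s.substring(2);
        }
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits: " + s.length());
        }
        byte[] data = new byte[s.length() / 2];
        for (int i = 0; i < s.length(); i += 2) {
            int hi = Character.digit(s.charAt(i), 16);
            int lo = Character.digit(s.charAt(i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid hex digit near index " + i);
            }
            data[i / 2] = (byte) ((hi << 4) | lo);
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] bytes = decode("0x68656C6C6F");
        System.out.println(new String(bytes, StandardCharsets.US_ASCII)); // prints "hello"
    }
}
```

Failing fast with IllegalArgumentException is one possible design choice; returning null or an empty array would also work if the caller prefers to show a fallback message in the TextView.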

Output

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs18 hello\par  }

Now that is RTF.

As for parsing the RTF and extracting the text, see "Java RTF Parser" or do a web search.
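For simple notes like the one in the question, a small hand-written stripper may be enough. It also avoids javax.swing, which Android does not ship. The following is a minimal sketch, not a full RTF parser: the names RtfPlainText, toPlainText and SKIPPED_DESTINATIONS are invented for illustration, the list of skipped groups is partial, \'hh escapes are treated as Latin-1 rather than the document's real code page, and \uN Unicode escapes and \bin data are not handled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not a library API: a deliberately small RTF-to-text stripper.
final class RtfPlainText {
    // Groups that hold metadata rather than body text (a partial, assumed list).
    private static final Set<String> SKIPPED_DESTINATIONS = new HashSet<>(
            Arrays.asList("fonttbl", "colortbl", "stylesheet", "info", "pict"));

    static String toPlainText(String rtf) {
        StringBuilder out = new StringBuilder();
        Deque<Boolean> saved = new ArrayDeque<>(); // "skipping" flag saved at each '{'
        boolean skipping = false;
        int i = 0;
        int n = rtf.length();
        while (i < n) {
            char c = rtf.charAt(i++);
            if (c == '{') {
                saved.push(skipping);
            } else if (c == '}') {
                if (!saved.isEmpty()) skipping = saved.pop();
            } else if (c == '\\' && i < n) {
                char d = rtf.charAt(i++);
                if (d == '\\' || d == '{' || d == '}') {
                    if (!skipping) out.append(d); // escaped literal character
                } else if (d == '\'' && i + 1 < n) {
                    // \'hh is one byte in the document code page; treated as Latin-1 here.
                    int value = Integer.parseInt(rtf.substring(i, i + 2), 16);
                    if (!skipping) out.append((char) value);
                    i += 2;
                } else if (d == '*') {
                    skipping = true; // \* marks an ignorable destination group
                } else if (isAsciiLetter(d)) {
                    int start = i - 1;
                    while (i < n && isAsciiLetter(rtf.charAt(i))) i++;
                    String word = rtf.substring(start, i);
                    // Optional numeric parameter, then one optional space delimiter.
                    if (i + 1 < n && rtf.charAt(i) == '-' && isAsciiDigit(rtf.charAt(i + 1))) i++;
                    while (i < n && isAsciiDigit(rtf.charAt(i))) i++;
                    if (i < n && rtf.charAt(i) == ' ') i++;
                    if (SKIPPED_DESTINATIONS.contains(word)) skipping = true;
                    else if (!skipping && (word.equals("par") || word.equals("line"))) out.append('\n');
                    else if (!skipping && word.equals("tab")) out.append('\t');
                }
                // Any other control symbol is ignored.
            } else if (c != '\r' && c != '\n' && c != '\\' && !skipping) {
                out.append(c); // raw CR/LF in RTF source is not document text
            }
        }
        return out.toString();
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        String rtf = "{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang1033{\\fonttbl{\\f0\\fnil\\fcharset0 "
                + "Microsoft Sans Serif;}}\\viewkind4\\uc1\\pard\\f0\\fs18 hello\\par  }";
        System.out.println(toPlainText(rtf).trim()); // prints "hello"
    }
}
```

On Android, the decoded RTF string from the hex step would be passed to toPlainText and the result handed to TextView.setText. For documents with tables, images, or non-Latin text, a dedicated RTF parsing library is likely a safer choice than this sketch.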

Answer 1: (score: 0)

Have you tried the solution from http://msdn.microsoft.com/en-us/library/cc488002.aspx?

class ConvertFromRTF
{
    static void Main()
    {

        string path = @"test.rtf";

        //Create the RichTextBox. (Requires a reference to System.Windows.Forms.dll.)
        System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox();

        // Get the contents of the RTF file. Note that when it is
        // stored in the string, it is encoded as UTF-16.
        string s = System.IO.File.ReadAllText(path);

        // Display the RTF text.
        System.Windows.Forms.MessageBox.Show(s);

        // Convert the RTF to plain text.
        rtBox.Rtf = s;
        string plainText = rtBox.Text;

        // Display plain text output in MessageBox because console
        // cannot display Greek letters.
        System.Windows.Forms.MessageBox.Show(plainText);

        // Output plain text to file, encoded as UTF-8.
        System.IO.File.WriteAllText(@"output.txt", plainText);
    }
}