I have been given a rather large Excel file in which each row contains a CLOB dump from our Oracle database. One of them might look like this:
{\rtf1\ansi\deff0\deftab708{\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fnil\fcharset0 Arial;}{\f2\fnil\fcharset0 MS Sans Serif;}{\f3\fnil\fcharset0 Times New Roman;}{\f4\fnil\fcharset238 Times New Roman CE;}{\f5\fnil\fcharset204 Times New Roman Cyr;}{\f6\fnil\fcharset161 Times New Roman Greek;}{\f7\fnil\fcharset162 Times New Roman Tur;}{\f8\fnil\fcharset186 Times New Roman Baltic;}}{\colortbl\red0\green0\blue0;\red255\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red128\green0\blue128;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green128\blue0;\red128\green0\blue0;\red128\green128\blue128;\red255\green255\blue255;}\paperw11906\paperh16838\margl1417\margr1417\margt1417\margb1417{\*\pnseclvl1\pnucrm\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{.}}}{\*\pnseclvl2\pnucltr\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{.}}}{\*\pnseclvl3\pndec\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{.}}}{\*\pnseclvl4\pnlcltr\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{)}}}{\*\pnseclvl5\pndec\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}{\*\pnseclvl6\pnlcltr\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}{\*\pnseclvl7\pnlcrm\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}{\*\pnseclvl8\pnlcltr\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}{\*\pnseclvl9\pnlcrm\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}{\pard\ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f3\fs24\cf0 FOO FOO FOO \'85\'85. \'85\'85..}}
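Now, by putting this data into the .Rtf property of a System.Windows.Forms.RichTextBox and then reading out its .Text value, I get a simple conversion. However, the result somehow still carries line breaks with it.
I have tried stripping them out, but that does not seem to help.
Does anyone know how to convert Rich Text Format into single-line plain text?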
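Answer 0 (score: 7)
Take a look at this example; the code is reproduced here for safekeeping.
Update: this was a copy-and-paste error from a VB.NET program. Sorry, everyone.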
class ConvertFromRTF
{
    static void Main()
    {
        string path = @"test.rtf";

        // Create the RichTextBox. (Requires a reference to System.Windows.Forms.dll.)
        // There must be no semicolon after the using (...) header; otherwise the
        // block below is not its body and rtBox is out of scope inside it.
        using (System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox())
        {
            // Get the contents of the RTF file. Note that when it is
            // stored in the string, it is encoded as UTF-16.
            string s = System.IO.File.ReadAllText(path);

            // Convert the RTF to plain text.
            rtBox.Rtf = s;
            string plainText = rtBox.Text;

            // Replace the line breaks with commas so the result is a single line.
            // "\r\n" is handled first, then bare "\n" and "\r", because the control
            // may not report Windows-style line endings.
            plainText = plainText.Replace("\r\n", ",").Replace("\n", ",").Replace("\r", ",");

            // Output plain text to file, encoded as UTF-8.
            System.IO.File.WriteAllText(@"output.txt", plainText);
        }
    }
}
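One likely reason the asker's attempt did not help is that, as far as I know, RichTextBox.Text usually reports line breaks as a bare "\n", so a replacement that only looks for "\r\n" never matches. The sketch below is not part of the linked example. It is a small helper with assumed names (LineFlattener and ToSingleLine are hypothetical) that collapses any mix of line-break styles into one line and has no dependency on Windows Forms.

```csharp
using System;

static class LineFlattener
{
    // Hypothetical helper: joins all lines of `text` into a single line,
    // treating "\r\n", "\n" and "\r" as line breaks and skipping empty lines.
    public static string ToSingleLine(string text, string separator = " ")
    {
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }

        // RemoveEmptyEntries drops the empty strings produced by consecutive
        // line breaks and by a trailing line break.
        string[] parts = text.Split(
            new[] { "\r\n", "\n", "\r" },
            StringSplitOptions.RemoveEmptyEntries);

        return string.Join(separator, parts).Trim();
    }
}
```

In the program above, calling LineFlattener.ToSingleLine(rtBox.Text, ",") would take the place of the chained Replace calls, with the difference that blank lines do not produce repeated commas.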
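Answer 1 (score: 1)
How to: Convert RTF to Plain Text (C# Programming Guide)
In the .NET Framework, you can use the RichTextBox control to create a word processor that supports RTF and lets the user apply formatting to text in a WYSIWYG manner.
You can also use the RichTextBox control to programmatically remove the RTF formatting codes from a document and convert it to plain text. You do not need to embed the control in a Windows Form to perform this kind of operation.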
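Since the data in the question comes from spreadsheet cells, not files, the same idea can be applied to an in-memory string. The following is a minimal sketch and is not taken from the guide: the names RtfText and ToPlainText are hypothetical, and it assumes the project references System.Windows.Forms.dll. The control is created, used, and disposed without ever being placed on a form.

```csharp
using System.Windows.Forms; // requires a reference to System.Windows.Forms.dll

static class RtfText
{
    // Hypothetical helper: converts one RTF string to plain text by letting
    // a RichTextBox parse it. The control is never shown or added to a form.
    public static string ToPlainText(string rtf)
    {
        using (var box = new RichTextBox())
        {
            box.Rtf = rtf;   // parse the RTF markup
            return box.Text; // read back the text without formatting codes
        }
    }
}
```

A call such as RtfText.ToPlainText(cellValue) would then be made once per row, where cellValue stands for whatever string was read from the Excel cell. The returned text may still contain line breaks, so it needs a separate pass if a single line is required.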