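I am reading a Word file with the Microsoft.Office.Interop.Word.Document library in Visual Studio. The problem is that the file contains special characters such as ρ and λ. When I read it in C#, they are converted to ? question marks.
For example, I am reading a line like
A child drinks a liquid of density ρ through a vertical straw.
and it is converted to A child drinks a liquid of density ? through a vertical straw.
Please help me preserve these characters in their original form.
Here is the code: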
public void ReadMsWord()
{
    // variable to store the file path
    string filePath = null;
    // open a dialog box to select the file
    OpenFileDialog file = new OpenFileDialog();
    // dialog box title
    file.Title = "Word File";
    // set the initial directory
    file.InitialDirectory = "c:\\";
    // restore the previous directory when the dialog closes
    file.RestoreDirectory = true;
    // run the if block when the dialog is confirmed with OK
    if (file.ShowDialog() == DialogResult.OK)
    {
        // store the selected file path
        filePath = file.FileName;
    }
    // create the Word application
    Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.ApplicationClass();
    // placeholder for optional arguments that are not supplied
    object miss = System.Reflection.Missing.Value;
    // boxed copy of the selected file path
    object path = filePath;
    // open the document read-write (ReadOnly = false)
    object readOnly = false;
    // open the document
    Microsoft.Office.Interop.Word.Document docs = word.Documents.Open(ref path, ref miss,
        ref readOnly, ref miss, ref miss, ref miss, ref miss, ref miss, ref miss,
        ref miss, ref miss, ref miss, ref miss, ref miss, ref miss, ref miss);
    try
    {
        // select the whole content of the active window's document
        docs.ActiveWindow.Selection.WholeStory();
        // hand the selection over to the clipboard
        docs.ActiveWindow.Selection.Copy();
        // get the IDataObject that wraps the clipboard data
        IDataObject data = Clipboard.GetDataObject();
        // read the clipboard contents in the Text format
        string t = data.GetData(DataFormats.Text).ToString();
        // split the text into lines
        string[] y = t.Split('\n');
    }
    catch (Exception)
    {
        // rethrow without resetting the stack trace
        throw;
    }
}
Answer 0 (score: 2)
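Use
t = data.GetData(DataFormats.UnicodeText).ToString();
i.e. UnicodeText instead of Text. DataFormats.Text is the ANSI text format, so characters that the system code page cannot represent are replaced with ?. Note that the special characters will still show up as ? in a console window, but they will display correctly in, for example, MessageBox.Show or the debugger.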
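To illustrate why the Text format loses ρ and λ, here is a minimal sketch that needs neither Word nor the clipboard. It assumes that Encoding.ASCII is a fair stand-in for a legacy code page with no Greek letters; the real ANSI code page depends on the system locale. The names EncodingDemo and RoundTrip are hypothetical and exist only for this example.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encode the text to bytes with the given encoding, then decode it again.
    // Characters the encoding cannot represent are replaced by its fallback,
    // which is '?' by default for the built-in Encoding.ASCII instance.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // \u03C1 is rho, \u03BB is lambda.
        string original = "density \u03C1, wavelength \u03BB";

        // Assumption: ASCII stands in for an ANSI code page without Greek letters.
        string lossy = RoundTrip(original, Encoding.ASCII);

        // Encoding.Unicode is UTF-16, which can represent both characters.
        string intact = RoundTrip(original, Encoding.Unicode);

        Console.WriteLine(lossy == "density ?, wavelength ?"); // True
        Console.WriteLine(intact == original);                 // True
    }
}
```

The comparison results are printed instead of the strings themselves because, as noted above, the console may render the Greek letters as ? even when the string is intact. Setting Console.OutputEncoding = Encoding.UTF8 before writing may help, although the outcome still depends on the console font.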