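I need to read the body of a .msg file and convert it to an XML file. I use the code below to do the conversion. I do get an XML file, but the output contains blank lines. I use a Regex to remove the blank lines from the string, and while debugging I can see that they are removed from the string. However, after loading that string as an XML document, the saved XML file still contains blank lines. An image of the sample XML file is attached.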
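using System.IO;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;
using Microsoft.Office.Interop.Outlook;

string[] filePaths = Directory.GetFiles(@"C:\Projects\Userdata\Source Folder\", "*.msg");
for (int i = 0; i < filePaths.Length; ++i)
{
    string path = filePaths[i];
    string fname = System.IO.Path.GetFileName(path);

    // Open the .msg file through Outlook interop and read its plain-text body.
    _Application outlook = new ApplicationClass();
    MailItem item = (MailItem)outlook.CreateItemFromTemplate(path, Type.Missing);
    string b = item.Body;

    // Attempt to strip whitespace-only lines before parsing.
    string formatbody = System.Text.RegularExpressions.Regex.Replace(b, @"^\s+$[\r\n]*", "", RegexOptions.Multiline);
    XDocument doc1 = XDocument.Parse(formatbody, LoadOptions.PreserveWhitespace);

    // Concatenate the top-level elements back into a string.
    var xs = doc1.Elements();
    string test = string.Empty;
    foreach (var x in xs)
    {
        test += x.ToString();
    }

    // Reload the string as an XmlDocument and save it next to the original file name.
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(test);
    doc.Save(@"C:\Projects\Destination Folder\" + fname + ".xml");
}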
The body of the .msg file looks like this:
<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="gateway_transaction_display.xsl"?>
<File>
<File_Type>AP PAYMENTS</File_Type>
<File_Header_Record>
<File_Format_Version>0002</File_Format_Version>
<Creation_Module>0286-14</Creation_Module>
</File_Header_Record>
<Transaction>
<Transaction_Type>FT_TRANS_IMP</Transaction_Type>
<Transaction_Header>
<Record_Number>1</Record_Number>
<Urgent>Y</Urgent>
</Transaction_Header>
<Model_Info>
<Model_ID><![CDATA[FF DOM INT PAY]]></Model_ID>
</Model_Info>
<Transfer_Info>
<Charges>15</Charges>
</Transfer_Info>
<Amounts>
<Transaction_Amount>
<Amount>4665786.22</Amount>
<Currency>CAD</Currency>
</Transaction_Amount>
</Amounts>
<Dates>
<Trusted_Source>Y</Trusted_Source>
<Value_Date>2014-03-31</Value_Date>
</Dates>
<Bank_Account>
<Bank_Account_Type>DR</Bank_Account_Type>
<Bank>
<Bank_Route_Code>
<Code_Type>Y</Code_Type>
</Bank_Route_Code>
</Bank>
<Account>
<Account_ID>FF01</Account_ID>
</Account>
</Bank_Account>
<Bank_Account>
<Bank_Account_Type>CR</Bank_Account_Type>
<Bank>
<Bank_Route_Code>
<Code_Type>Y</Code_Type>
</Bank_Route_Code>
</Bank>
<Account>
<Account_ID>D039</Account_ID>
</Account>
</Bank_Account>
<Payment_Details_Or_Addenda>
<Details_Text><![CDATA[Unapplied
cash & intercompany settlemet]]></Details_Text>
</Payment_Details_Or_Addenda>
</Transaction>
<File_Trailer_Record>
<File_Name>AP PAYMENTS</File_Name>
</File_Trailer_Record>
</File>
Answer 0 (score: 2)
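You don't need a Regex to remove the whitespace. Instead:
1. Trim the message content before parsing it into an XDocument.
string result = item.Body.Trim();
2. Specify LoadOptions.None instead of LoadOptions.PreserveWhitespace, so that whitespace-only text between elements is not kept in the tree.
XDocument doc1 = XDocument.Parse(result, LoadOptions.None);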
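Putting the two steps together, a minimal sketch might look like the following. The class name TrimAndParseDemo, the helper ParseBody, and the sample body string are made up for illustration and are not part of the original code; the sample only imitates a message body with stray blank lines.

```csharp
using System;
using System.Xml.Linq;

public class TrimAndParseDemo
{
    // Hypothetical helper: trims the raw body and parses it without
    // preserving insignificant whitespace.
    public static XDocument ParseBody(string body)
    {
        // Leading whitespace before the XML declaration would make the
        // parser throw, so trim first.
        string result = body.Trim();

        // LoadOptions.None drops whitespace-only text between elements,
        // so blank lines in the input do not end up in the tree.
        return XDocument.Parse(result, LoadOptions.None);
    }

    public static void Main()
    {
        // Assumed sample input: a shortened body with blank lines added.
        string body =
            "\r\n<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\r\n\r\n" +
            "<File>\r\n\r\n" +
            "  <File_Type>AP PAYMENTS</File_Type>\r\n\r\n" +
            "</File>\r\n";

        XDocument doc = ParseBody(body);

        // ToString() re-indents the tree; no blank lines should remain.
        Console.WriteLine(doc.ToString());

        // In the loop from the question, doc.Save(destinationPath) could be
        // called here instead of round-tripping through XmlDocument.
    }
}
```

Because XDocument has its own Save method, the intermediate string concatenation and the XmlDocument reload from the question are probably unnecessary with this approach, though that is a suggestion rather than something the answer states.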
- SJ