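OK, so this is an interesting one...
I'm writing a simple file analyzer that tries to identify different types of files. I've been given a large pile of sample files to test with.
I don't know what "RSS" is, but everything I can find claims that it's XML-based. However, I have a whole bunch of *.rss
files, and they don't look like XML to me: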
X-MS-FeedTitle: Microsoft At Work
From: "Microsoft at Work"
Subject: Keep yourself organized with Microsoft Outlook Tasks
Date: Mon, 27 Jun 2011 00:00:00 -0700
Message-ID: 00000026
MIME-Version: 1.0
Content-Type: text/html;
charset="UTF-8"
Content-Transfer-Encoding: base64
X-MS-ItemUrl: http://www.microsoft.com/atwork/productivity/streamline.aspx?WT.rss_f=At Work RSS&WT.rss_a=Keep yourself organized with Microsoft Outlook Tasks&WT.rss_ev=a
X-MimeOLE: Produced By Microsoft MimeOLE V14.0.8117.416
77u/PEhUTUw+PEhFQUQ+PE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVudD0idGV4
dC9odG1sOyBjaGFyc2V0PXV0Zi04Ij48U1RZTEU+Qk9EWSB7Zm9udC1mYW1pbHk6IEFyaWFsO2Zv
bnQtc2l6ZTogMTBwdDt9PC9TVFlMRT48L0hFQUQ+PEJPRFk+PGJyPlRoZXNlIHNpeCBNaWNyb3Nv
ZnQgT3V0bG9vayBUYXNrcyB0aXBzIHdpbGwgaGVscCB5b3Ugc3RheSBvbmUgc3RlcCBhaGVhZCBv
ZiB0aGUgY29tcGV0aXRpb24uPC9CT0RZPjwvSFRNTD4=
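This isn't XML. It looks more like some kind of protocol header block followed by a base64-encoded payload.
What is this thing? It doesn't look like what I was expecting...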
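For the analyzer itself, a rough first-pass check might look like the sketch below. It is only a heuristic under my own assumptions: the function name `sniff` and the labels it returns are made up for illustration, and "header block" here just means a run of `Name: value` lines like the ones in the sample.

```python
import re

# A header line looks like "Name: value". The name is printable ASCII
# with no spaces and no colon (the colon ends the name).
HEADER_LINE = re.compile(r"^[!-9;-~]+:\s?.*$")


def sniff(text: str) -> str:
    """Rough guess at the container format from the first few lines.

    Hypothetical helper: the labels returned are illustrative only.
    """
    # Ignore a leading BOM and whitespace before looking at the content.
    stripped = text.lstrip("\ufeff \t\r\n")
    if stripped.startswith("<"):
        return "xml-like"

    lines = stripped.splitlines()[:5]
    # Every inspected line must be a header or an indented continuation line.
    if lines and all(
        HEADER_LINE.match(line) or line.startswith((" ", "\t"))
        for line in lines
    ):
        return "header-block"
    return "unknown"


sample = (
    "X-MS-FeedTitle: Microsoft At Work\n"
    'From: "Microsoft at Work"\n'
    "Subject: Keep yourself organized with Microsoft Outlook Tasks\n"
    "Date: Mon, 27 Jun 2011 00:00:00 -0700\n"
    "Message-ID: 00000026\n"
)
print(sniff(sample))
print(sniff('<?xml version="1.0"?><rss version="2.0"></rss>'))
```

A check like this only says what the file looks like, not what produced it, so the extension and the content can still disagree, as they do here.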
Edit: Here is the result of decoding the base64 block:
<HTML><HEAD><META http-equiv=Content-Type content="text/html; charset=utf-8"><STYLE>BODY {font-family: Arial;font-size: 10pt;}</STYLE></HEAD><BODY><br>These six Microsoft Outlook Tasks tips will help you stay one step ahead of the competition.</BODY></HTML>
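The decode can be reproduced with Python's standard `email` package, assuming the file really is laid out like a MIME message (header lines, a blank line, then the body). The blank separator line and the tab before `charset` are my guess at the original layout, since the paste may have lost whitespace, and only a few of the headers are kept for brevity.

```python
from email import message_from_string

# Trimmed copy of the sample file. The blank line between headers and body
# and the tab-indented continuation line are assumptions about the original.
raw = (
    "X-MS-FeedTitle: Microsoft At Work\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html;\n"
    '\tcharset="UTF-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "77u/PEhUTUw+PEhFQUQ+PE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVudD0idGV4\n"
    "dC9odG1sOyBjaGFyc2V0PXV0Zi04Ij48U1RZTEU+Qk9EWSB7Zm9udC1mYW1pbHk6IEFyaWFsO2Zv\n"
    "bnQtc2l6ZTogMTBwdDt9PC9TVFlMRT48L0hFQUQ+PEJPRFk+PGJyPlRoZXNlIHNpeCBNaWNyb3Nv\n"
    "ZnQgT3V0bG9vayBUYXNrcyB0aXBzIHdpbGwgaGVscCB5b3Ugc3RheSBvbmUgc3RlcCBhaGVhZCBv\n"
    "ZiB0aGUgY29tcGV0aXRpb24uPC9CT0RZPjwvSFRNTD4=\n"
)

msg = message_from_string(raw)

# decode=True undoes the Content-Transfer-Encoding (base64 here) and
# returns the body as bytes.
payload = msg.get_payload(decode=True)

# The payload begins with a UTF-8 byte order mark ("77u/" in base64),
# so "utf-8-sig" is used to drop it while decoding.
html = payload.decode("utf-8-sig")

print(msg.get_content_type())
print(html)
```

Going through `get_payload(decode=True)` rather than decoding the base64 by hand lets the `Content-Transfer-Encoding` header decide how the body is decoded.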