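I'm trying to extract data from an XML PList (Apple System Profiler) file and read it into an in-memory database, and ultimately I want to turn it into something human-readable.
The problem is that the format seems very hard to read in a consistent way. I've gone through several approaches, but I haven't found one I'm satisfied with. I always end up hard-coding a lot of values and needing a lot of if-else/switch statements.
The format looks like this: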
<plist>
<key>_system</key>
<array>
<dict>
<key>_cpu_type</key>
<string>Intel Core Duo</string>
</dict>
</array>
</plist>
A sample file is here.
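After reading it (or while reading it), I use an internal dictionary to determine what type of information each entry is. For example, if the key is cpu_type, I store the value accordingly.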
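One way to keep such an internal dictionary from turning into a long if-else/switch chain is to map each key directly to a small handler. The sketch below is only an illustration under assumptions: the `SystemInfo` class, its properties, the `KeyDispatch` class, and the `_machine_name` key are all hypothetical, and the real set of keys depends on the profiler report.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical target type; the property names are assumptions for illustration.
public class SystemInfo
{
    public string CpuType { get; set; }
    public string MachineName { get; set; }
}

public static class KeyDispatch
{
    // One entry per known key replaces one branch of a switch statement.
    private static readonly Dictionary<string, Action<SystemInfo, string>> Handlers =
        new Dictionary<string, Action<SystemInfo, string>>
        {
            { "_cpu_type", (info, value) => info.CpuType = value },
            // "_machine_name" is a made-up key used only to show a second entry.
            { "_machine_name", (info, value) => info.MachineName = value },
        };

    // Returns true when the key was recognized and stored, false otherwise.
    public static bool Apply(SystemInfo info, string key, string value)
    {
        Action<SystemInfo, string> handler;
        if (!Handlers.TryGetValue(key, out handler))
        {
            return false; // unknown keys are skipped rather than treated as errors
        }
        handler(info, value);
        return true;
    }
}
```

Calling `KeyDispatch.Apply(info, "_cpu_type", "Intel Core Duo")` stores the value in `info.CpuType`; supporting a new key then means adding one table entry rather than another branch.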
Here are some (simplified) examples of what I've tried in order to extract the information.
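// Simplified attempt: walk the file and pair each <key> with the next <string>.
// dct and KeyPair are declared elsewhere (not shown).
XmlTextReader reader = new XmlTextReader("C:\\test.spx");
reader.XmlResolver = null; // do not try to resolve the external DTD
reader.ReadStartElement("plist");
String key = String.Empty;
String str = String.Empty;
Int32 Index = 0;
while (reader.Read())
{
    // Only react to start tags; end tags report the same LocalName.
    if (reader.NodeType != XmlNodeType.Element)
        continue;
    if (reader.LocalName == "key")
    {
        Index++;
        key = reader.ReadString();
    }
    else if (reader.LocalName == "string")
    {
        str = reader.ReadString();
        if (key != String.Empty)
        {
            dct.Add(Index, new KeyPair(key, str));
            key = String.Empty;
        }
    }
}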
Or something like this:
// xdoc.Root is already the <plist> element, so look at the nested <dict> elements.
foreach (var d in xdoc.Root.Descendants("dict"))
    dict.Add(d.Element("key").Value, d.Element("string").Value);
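Both attempts treat the file as a flat list, while in the format shown earlier each `key` is paired with the element that immediately follows it, and `dict` and `array` can nest. A minimal recursive sketch with LINQ to XML follows. It assumes only the element names from the sample (`dict`, `array`, `key`, `string`) and keeps any other value element as raw text; the `PlistSketch` class and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PlistSketch
{
    // Converts one plist element into a Dictionary, a List, or a string.
    public static object ToObject(XElement element)
    {
        switch (element.Name.LocalName)
        {
            case "dict":
                return ReadPairs(element);
            case "array":
                return element.Elements().Select(e => ToObject(e)).ToList();
            default:
                // string and any other leaf element are kept as raw text here
                return element.Value;
        }
    }

    // Pairs every <key> child with the element that directly follows it.
    public static Dictionary<string, object> ReadPairs(XElement container)
    {
        var result = new Dictionary<string, object>();
        foreach (XElement key in container.Elements("key"))
        {
            XElement value = key.ElementsAfterSelf().FirstOrDefault();
            if (value != null)
            {
                result[key.Value] = ToObject(value); // a repeated key overwrites the earlier one
            }
        }
        return result;
    }
}
```

For the sample above, `PlistSketch.ReadPairs(XDocument.Parse(xml).Root)` gives a dictionary whose `_system` entry is a list containing one nested dictionary with the `_cpu_type` entry. No key names are hard-coded, so interpretation of the keys can happen in a separate step.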
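I found a framework here that I could adapt.
More useful information
Information on the Mac OS X System Profiler here.
An AppleScript for parsing the XML file here.
Any advice or insight on this would be greatly appreciated.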
Answer 0 (score: 9)
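My first thought was simply to use XSLT (XSL Transformations). Going by your answers in the comments above, I'm not sure exactly what format you're after, but I think I at least got the gist. Unless you need something special that I haven't thought of, I believe XSLT is powerful enough to do everything you need, and without a pile of complicated looping constructs.
If you're not familiar with it, w3schools has a lot of good information on XSLT (probably start with the introduction: http://www.w3schools.com/xsl/xsl_intro.asp), and Wikipedia has a good write-up as well (http://en.wikipedia.org/wiki/XSLT).
It always takes me a while to get the rules working the way I want; it's a different way of thinking about this kind of transformation, and it takes some getting used to. You also need a proper understanding of XPath. I often have to refer to the XSLT spec (http://www.w3.org/TR/xslt) and the XPath spec (http://www.w3.org/TR/xpath/) side by side, because I only have a little experience with them; it probably goes more smoothly once you've used it for a while.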
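As a small illustration of the kind of XPath involved, the sketch below selects every `key` whose next sibling element is a `string` and then reads that sibling. It is only a sketch under assumptions: the `XPathSketch` class and method name are hypothetical, and it covers just the `key`/`string` pairing from the sample in the question.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XPathSketch
{
    // Returns (key text, value text) pairs for keys directly followed by a <string>.
    public static List<KeyValuePair<string, string>> KeyValuePairs(string xml)
    {
        var doc = new XmlDocument();
        doc.XmlResolver = null; // do not resolve an external DTD
        doc.LoadXml(xml);

        var pairs = new List<KeyValuePair<string, string>>();
        // "//key" finds key elements at any depth; the predicate keeps only those
        // whose first following sibling element is named "string".
        foreach (XmlNode key in doc.SelectNodes("//key[following-sibling::*[1][self::string]]"))
        {
            XmlNode value = key.SelectSingleNode("following-sibling::*[1]");
            pairs.Add(new KeyValuePair<string, string>(key.InnerText, value.InnerText));
        }
        return pairs;
    }
}
```

The `following-sibling` axis is the piece that matters for plist data, because the value belonging to a key is its neighbour rather than its child.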
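Anyway, I have an application I wrote a while back for playing with these transformations. It's a C# application with three text boxes: one for the XSLT, one for the source, and one for the output. I spent some time (OK, a lot) trying to get a first-pass XSLT working against the sample data, to see how hard it would be and what the structure of the transform would look like. I think I finally worked out what is needed, but since I don't know exactly what format you need, I stopped there.
Here is a link to the sample transformation output: http://pastebin.com/SMFxUdDK.
Below is all the code that actually performs the transformation, wrapped in a form you can use for development. It's not fancy, but it works well for me. The "heavy lifting" is all done in the "btnTransform_Click()" handler, and I've implemented an XmlStringWriter so that the output comes out the way I want. The main work here is coming up with the XSLT instructions; the .NET XslCompiledTransform class takes care of the actual transformation nicely. Still, I felt I had spent enough time figuring out all the details that it was worth giving a working example while writing this up...
Note that I changed a few namespaces on the fly and also added some light comments to the XSLT, so if anything is broken let me know and I'll correct it.
So, without further ado: ;)
The XSLT file: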
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="msxsl"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
>
<!-- this just says to output XML as opposed to HTML or raw text -->
<xsl:output method="xml" indent="yes" xsi:type="xsl:output" />
<!-- this matches the root element and then creates a root element -->
<!-- with more templates applied as children -->
<xsl:template match="/" priority="9" >
<xsl:element name="root" xmlns="http://www.tempuri.org/plist">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<!-- wasn't sure how you would want the dict and arrays handled -->
<!-- for a final cut, so i just make them into parent nodes of -->
<!-- the data underneath them, and then apply the templates -->
<xsl:template match="dict" priority="3" >
<xsl:element name="dictionary" xmlns="http://www.tempuri.org/plist">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<xsl:template match="array" priority="5" >
<xsl:element name="list" xmlns="http://www.tempuri.org/plist">
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<!-- actually, figuring the following step out is what hung me up; the -->
<!-- issue here is that i'm taking the text out of the string/integer/date -->
<!-- nodes and putting them into elements named after the 'key' nodes -->
<!-- because of this, you actually have to have the template match the -->
<!-- nodes you will be consuming and then just using the conditional -->
<!-- to only process the 'key' nodes. also, there were a couple of -->
<!-- stray characters in the source XML; i think it was an encoding -->
<!-- issue, so i just stripped them out with the "translate" call when -->
<!-- creating the keyName variable. since those were the only two -->
<!-- and because they looked to be strays, i did not worry about it -->
<!-- further. the only reason it is an issue is because i was -->
<!-- creating elements out of the contents of the keys, and key names -->
<!-- are restricted in what characters they can use. -->
<xsl:template match="key|string|integer|date" priority="1" >
<xsl:if test="local-name(self::node())='key'">
<xsl:variable name="keyName" select="translate(child::text(),' €™','---')" />
<xsl:element name="{$keyName}" xmlns="http://www.tempuri.org/plist" >
<!-- removed on-the-fly; i had put this in while testing
<xsl:if test="local-name(following-sibling::node())='string'">
-->
<xsl:value-of select="following-sibling::node()" />
<!--
</xsl:if>
-->
</xsl:element>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
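The comment in the stylesheet notes that key text is not always a legal element name and works around it with `translate`. If the element names were built in C# instead, one alternative is `XmlConvert.EncodeLocalName`, which escapes disallowed characters rather than replacing them. This is a sketch of that alternative, not what the stylesheet does; the `ElementNameSketch` class is hypothetical.

```csharp
using System.Xml;

public static class ElementNameSketch
{
    // Turns arbitrary key text into a string that is legal as an XML element name.
    public static string FromKey(string keyText)
    {
        // Characters that are not allowed in a name are escaped as _xHHHH_,
        // for example a space becomes _x0020_.
        return XmlConvert.EncodeLocalName(keyText);
    }

    // Recovers the original key text from an encoded element name.
    public static string ToKey(string elementName)
    {
        return XmlConvert.DecodeName(elementName);
    }
}
```

Unlike the `translate` call, this is reversible, so the original key text can be recovered from the element name later.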
A small helper class I made (XmlStringWriter.cs):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
namespace XSLTTest.Xml
{
public class XmlStringWriter : XmlWriter
{
public static XmlStringWriter Create(XmlWriterSettings Settings)
{
return new XmlStringWriter(Settings);
}
public static XmlStringWriter Create()
{
return XmlStringWriter.Create(XmlStringWriter.XmlWriterSettings_display);
}
public static XmlWriterSettings XmlWriterSettings_display
{
get
{
XmlWriterSettings XWS = new XmlWriterSettings();
XWS.OmitXmlDeclaration = false; // make a choice?
XWS.NewLineHandling = NewLineHandling.Replace;
XWS.NewLineOnAttributes = false;
XWS.Indent = true;
XWS.IndentChars = "\t";
XWS.NewLineChars = Environment.NewLine;
//XWS.ConformanceLevel = ConformanceLevel.Fragment;
XWS.CloseOutput = false;
return XWS;
}
}
public override string ToString()
{
return myXMLStringBuilder.ToString();
}
//public static implicit operator XmlWriter(XmlStringWriter Me)
//{
// return Me.myXMLWriter;
//}
//--------------
protected StringBuilder myXMLStringBuilder = null;
protected XmlWriter myXMLWriter = null;
protected XmlStringWriter(XmlWriterSettings Settings)
{
myXMLStringBuilder = new StringBuilder();
myXMLWriter = XmlWriter.Create(myXMLStringBuilder, Settings);
}
public override void Close()
{
myXMLWriter.Close();
}
public override void Flush()
{
myXMLWriter.Flush();
}
public override string LookupPrefix(string ns)
{
return myXMLWriter.LookupPrefix(ns);
}
public override void WriteBase64(byte[] buffer, int index, int count)
{
myXMLWriter.WriteBase64(buffer, index, count);
}
public override void WriteCData(string text)
{
myXMLWriter.WriteCData(text);
}
public override void WriteCharEntity(char ch)
{
myXMLWriter.WriteCharEntity(ch);
}
public override void WriteChars(char[] buffer, int index, int count)
{
myXMLWriter.WriteChars(buffer, index, count);
}
public override void WriteComment(string text)
{
myXMLWriter.WriteComment(text);
}
public override void WriteDocType(string name, string pubid, string sysid, string subset)
{
myXMLWriter.WriteDocType(name, pubid, sysid, subset);
}
public override void WriteEndAttribute()
{
myXMLWriter.WriteEndAttribute();
}
public override void WriteEndDocument()
{
myXMLWriter.WriteEndDocument();
}
public override void WriteEndElement()
{
myXMLWriter.WriteEndElement();
}
public override void WriteEntityRef(string name)
{
myXMLWriter.WriteEntityRef(name);
}
public override void WriteFullEndElement()
{
myXMLWriter.WriteFullEndElement();
}
public override void WriteProcessingInstruction(string name, string text)
{
myXMLWriter.WriteProcessingInstruction(name, text);
}
public override void WriteRaw(string data)
{
myXMLWriter.WriteRaw(data);
}
public override void WriteRaw(char[] buffer, int index, int count)
{
myXMLWriter.WriteRaw(buffer, index, count);
}
public override void WriteStartAttribute(string prefix, string localName, string ns)
{
myXMLWriter.WriteStartAttribute(prefix, localName, ns);
}
public override void WriteStartDocument(bool standalone)
{
myXMLWriter.WriteStartDocument(standalone);
}
public override void WriteStartDocument()
{
myXMLWriter.WriteStartDocument();
}
public override void WriteStartElement(string prefix, string localName, string ns)
{
myXMLWriter.WriteStartElement(prefix, localName, ns);
}
public override WriteState WriteState
{
get
{
return myXMLWriter.WriteState;
}
}
public override void WriteString(string text)
{
myXMLWriter.WriteString(text);
}
public override void WriteSurrogateCharEntity(char lowChar, char highChar)
{
myXMLWriter.WriteSurrogateCharEntity(lowChar, highChar);
}
public override void WriteWhitespace(string ws)
{
myXMLWriter.WriteWhitespace(ws);
}
}
}
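If the wrapper class is more than is needed, string output can also be collected with a plain `XmlWriter` created over a `StringBuilder`. The sketch below uses settings similar to the ones in the helper class (it omits the XML declaration for brevity); the `StringOutputSketch` class and the element it writes are hypothetical and only there to show the mechanism.

```csharp
using System;
using System.Text;
using System.Xml;

public static class StringOutputSketch
{
    // Writes a tiny document into a string using display-oriented settings.
    public static string WriteSample()
    {
        var settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.IndentChars = "\t";
        settings.NewLineChars = Environment.NewLine;
        settings.OmitXmlDeclaration = true;

        var buffer = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            writer.WriteStartElement("root");
            writer.WriteElementString("_cpu_type", "Intel Core Duo");
            writer.WriteEndElement();
        } // disposing the writer flushes everything into the buffer
        return buffer.ToString();
    }
}
```

The subclass above has the advantage of bundling the buffer and the settings into one object that can be passed wherever an `XmlWriter` is expected.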
The Windows Forms designer class (frmXSLTTest.Designer.cs):
namespace XSLTTest
{
partial class frmXSLTTest
{
/// <summary>
/// Required designer variable.
/// </summary>
private System.ComponentModel.IContainer components = null;
/// <summary>
/// Clean up any resources being used.
/// </summary>
/// <param name="disposing">true if managed resources should be disposed; otherwise, false.</param>
protected override void Dispose(bool disposing)
{
if (disposing && (components != null))
{
components.Dispose();
}
base.Dispose(disposing);
}
#region Windows Form Designer generated code
/// <summary>
/// Required method for Designer support - do not modify
/// the contents of this method with the code editor.
/// </summary>
private void InitializeComponent()
{
this.splitContainer1 = new System.Windows.Forms.SplitContainer();
this.btnTransform = new System.Windows.Forms.Button();
this.groupBox1 = new System.Windows.Forms.GroupBox();
this.txtStylesheet = new System.Windows.Forms.TextBox();
this.splitContainer2 = new System.Windows.Forms.SplitContainer();
this.groupBox2 = new System.Windows.Forms.GroupBox();
this.txtInputXML = new System.Windows.Forms.TextBox();
this.groupBox3 = new System.Windows.Forms.GroupBox();
this.txtOutputXML = new System.Windows.Forms.TextBox();
((System.ComponentModel.ISupportInitialize)(this.splitContainer1)).BeginInit();
this.splitContainer1.Panel1.SuspendLayout();
this.splitContainer1.Panel2.SuspendLayout();
this.splitContainer1.SuspendLayout();
this.groupBox1.SuspendLayout();
((System.ComponentModel.ISupportInitialize)(this.splitContainer2)).BeginInit();
this.splitContainer2.Panel1.SuspendLayout();
this.splitContainer2.Panel2.SuspendLayout();
this.splitContainer2.SuspendLayout();
this.groupBox2.SuspendLayout();
this.groupBox3.SuspendLayout();
this.SuspendLayout();
//
// splitContainer1
//
this.splitContainer1.Dock = System.Windows.Forms.DockStyle.Fill;
this.splitContainer1.Location = new System.Drawing.Point(0, 0);
this.splitContainer1.Name = "splitContainer1";
this.splitContainer1.Orientation = System.Windows.Forms.Orientation.Horizontal;
//
// splitContainer1.Panel1
//
this.splitContainer1.Panel1.Controls.Add(this.btnTransform);
this.splitContainer1.Panel1.Controls.Add(this.groupBox1);
//
// splitContainer1.Panel2
//
this.splitContainer1.Panel2.Controls.Add(this.splitContainer2);
this.splitContainer1.Size = new System.Drawing.Size(788, 363);
this.splitContainer1.SplitterDistance = 194;
this.splitContainer1.TabIndex = 0;
//
// btnTransform
//
this.btnTransform.Anchor = ((System.Windows.Forms.AnchorStyles)((System.Windows.Forms.AnchorStyles.Bottom | System.Windows.Forms.AnchorStyles.Left)));
this.btnTransform.Location = new System.Drawing.Point(6, 167);
this.btnTransform.Name = "btnTransform";
this.btnTransform.Size = new System.Drawing.Size(75, 23);
this.btnTransform.TabIndex = 1;
this.btnTransform.Text = "Transform";
this.btnTransform.UseVisualStyleBackColor = true;
this.btnTransform.Click += new System.EventHandler(this.btnTransform_Click);
//
// groupBox1
//
this.groupBox1.Anchor = ((System.Windows.Forms.AnchorStyles)((((System.Windows.Forms.AnchorStyles.Top | System.Windows.Forms.AnchorStyles.Bottom)
| System.Windows.Forms.AnchorStyles.Left)
| System.Windows.Forms.AnchorStyles.Right)));
this.groupBox1.Controls.Add(this.txtStylesheet);
this.groupBox1.Location = new System.Drawing.Point(3, 3);
this.groupBox1.Name = "groupBox1";
this.groupBox1.Size = new System.Drawing.Size(782, 161);
this.groupBox1.TabIndex = 0;
this.groupBox1.TabStop = false;
this.groupBox1.Text = "Stylesheet";
//
// txtStylesheet
//
this.txtStylesheet.Dock = System.Windows.Forms.DockStyle.Fill;
this.txtStylesheet.Font = new System.Drawing.Font("Lucida Console", 7F, System.Drawing.FontStyle.Regular, System.Drawing.GraphicsUnit.Point, ((byte)(0)));
this.txtStylesheet.Location = new System.Drawing.Point(3, 16);
this.txtStylesheet.MaxLength = 1000000;
this.txtStylesheet.Multiline = true;
this.txtStylesheet.Name = "txtStylesheet";
this.txtStylesheet.ScrollBars = System.Windows.Forms.ScrollBars.Both;
this.txtStylesheet.Size = new System.Drawing.Size(776, 142);
this.txtStylesheet.TabIndex = 0;
//
// splitContainer2
//
this.splitContainer2.Dock = System.Windows.Forms.DockStyle.Fill;
this.splitContainer2.Location = new System.Drawing.Point(0, 0);
this.splitContainer2.Name = "splitContainer2";
//
// splitContainer2.Panel1
//
this.splitContainer2.Panel1.Controls.Add(this.groupBox2);
//
// splitContainer2.Panel2
//
this.splitContainer2.Panel2.Controls.Add(this.groupBox3);
this.splitContainer2.Size = new System.Drawing.Size(788, 165);
this.splitContainer2.SplitterDistance = 395;
this.splitContainer2.TabIndex = 0;
//
// groupBox2
//
this.groupBox2.Controls.Add(this.txtInputXML);
this.groupBox2.Dock = System.Windows.Forms.DockStyle.Fill;
this.groupBox2.Location = new System.Drawing.Point(0, 0);
this.groupBox2.Name = "groupBox2";
this.groupBox2.Size = new System.Drawing.Size(395, 165);
this.groupBox2.TabIndex = 1;
this.groupBox2.TabStop = false;
this.groupBox2.Text = "Input XML";
//
// txtInputXML
//
this.txtInputXML.Dock = System.Windows.Forms.DockStyle.Fill;
this.txtInputXML.Font = new System.Drawing.Font("Lucida Console", 7F, System.Drawing.FontStyle.Regular, System.Drawing.GraphicsUnit.Point, ((byte)(0)));
this.txtInputXML.Location = new System.Drawing.Point(3, 16);
this.txtInputXML.MaxLength = 1000000;
this.txtInputXML.Multiline = true;
this.txtInputXML.Name = "txtInputXML";
this.txtInputXML.ScrollBars = System.Windows.Forms.ScrollBars.Both;
this.txtInputXML.Size = new System.Drawing.Size(389, 146);
this.txtInputXML.TabIndex = 1;
//
// groupBox3
//
this.groupBox3.Controls.Add(this.txtOutputXML);
this.groupBox3.Dock = System.Windows.Forms.DockStyle.Fill;
this.groupBox3.Location = new System.Drawing.Point(0, 0);
this.groupBox3.Name = "groupBox3";
this.groupBox3.Size = new System.Drawing.Size(389, 165);
this.groupBox3.TabIndex = 1;
this.groupBox3.TabStop = false;
this.groupBox3.Text = "Output XML";
//
// txtOutputXML
//
this.txtOutputXML.Dock = System.Windows.Forms.DockStyle.Fill;
this.txtOutputXML.Font = new System.Drawing.Font("Lucida Console", 7F, System.Drawing.FontStyle.Regular, System.Drawing.GraphicsUnit.Point, ((byte)(0)));
this.txtOutputXML.Location = new System.Drawing.Point(3, 16);
this.txtOutputXML.MaxLength = 1000000;
this.txtOutputXML.Multiline = true;
this.txtOutputXML.Name = "txtOutputXML";
this.txtOutputXML.ScrollBars = System.Windows.Forms.ScrollBars.Both;
this.txtOutputXML.Size = new System.Drawing.Size(383, 146);
this.txtOutputXML.TabIndex = 1;
//
// frmXSLTTest
//
this.AutoScaleDimensions = new System.Drawing.SizeF(6F, 13F);
this.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font;
this.ClientSize = new System.Drawing.Size(788, 363);
this.Controls.Add(this.splitContainer1);
this.Name = "frmXSLTTest";
this.Text = "frmXSLTTest";
this.splitContainer1.Panel1.ResumeLayout(false);
this.splitContainer1.Panel2.ResumeLayout(false);
((System.ComponentModel.ISupportInitialize)(this.splitContainer1)).EndInit();
this.splitContainer1.ResumeLayout(false);
this.groupBox1.ResumeLayout(false);
this.groupBox1.PerformLayout();
this.splitContainer2.Panel1.ResumeLayout(false);
this.splitContainer2.Panel2.ResumeLayout(false);
((System.ComponentModel.ISupportInitialize)(this.splitContainer2)).EndInit();
this.splitContainer2.ResumeLayout(false);
this.groupBox2.ResumeLayout(false);
this.groupBox2.PerformLayout();
this.groupBox3.ResumeLayout(false);
this.groupBox3.PerformLayout();
this.ResumeLayout(false);
}
#endregion
private System.Windows.Forms.SplitContainer splitContainer1;
private System.Windows.Forms.Button btnTransform;
private System.Windows.Forms.GroupBox groupBox1;
private System.Windows.Forms.TextBox txtStylesheet;
private System.Windows.Forms.SplitContainer splitContainer2;
private System.Windows.Forms.GroupBox groupBox2;
private System.Windows.Forms.GroupBox groupBox3;
private System.Windows.Forms.TextBox txtInputXML;
private System.Windows.Forms.TextBox txtOutputXML;
}
}
The form class (frmXSLTTest.cs):
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Xml;
using System.Xml.Xsl;
using XSLTTest.Xml;
namespace XSLTTest
{
public partial class frmXSLTTest : Form
{
public frmXSLTTest()
{
InitializeComponent();
}
private void btnTransform_Click(object sender, EventArgs e)
{
try
{
// temporary to copy from clipboard when pressing
// the button instead of using the text in the textbox
//txtStylesheet.Text = Clipboard.GetText();
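// Load the stylesheet text from the stylesheet text box.
XmlDocument Stylesheet = new XmlDocument();
Stylesheet.InnerXml = txtStylesheet.Text;
// Passing true enables debugging support for the compiled stylesheet.
XslCompiledTransform XCT = new XslCompiledTransform(true);
XCT.Load(Stylesheet);
// Load the source XML from the input text box.
XmlDocument InputDocument = new XmlDocument();
InputDocument.InnerXml = txtInputXML.Text;
XmlStringWriter OutputWriter = XmlStringWriter.Create();
XCT.Transform(InputDocument, OutputWriter);
// Flush so that any buffered output reaches the underlying StringBuilder.
OutputWriter.Flush();
txtOutputXML.Text = OutputWriter.ToString();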
}
catch (Exception Ex)
{
txtOutputXML.Text = Ex.Message;
}
}
}
}
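Outside of a form, the same transformation can be sketched as a single method. This version is an assumption-laden sketch rather than the code above: the `TransformSketch` class is hypothetical, and it writes to a `StringWriter` using the stylesheet's own output settings instead of the XmlStringWriter helper. Like the handler, it loads the input into an `XmlDocument`, which by default does not keep whitespace-only text between elements.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TransformSketch
{
    // Applies the stylesheet text to the input XML text and returns the result.
    public static string Apply(string stylesheetXml, string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(stylesheetXml)))
        {
            transform.Load(stylesheet);
        }

        var input = new XmlDocument();
        input.XmlResolver = null; // do not resolve an external DTD
        input.LoadXml(inputXml);

        var output = new StringWriter();
        // OutputSettings reflects the xsl:output element of the loaded stylesheet.
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

A call such as `TransformSketch.Apply(File.ReadAllText(xsltPath), File.ReadAllText(spxPath))` would then return the transformed text, where both path variables are placeholders for your own files.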