Using the XML SDK 2.0 to read Office 10 file details/properties

Time: 2011-05-25 06:49:20

Tags: c# xml file sdk details

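I need to read file details from the new Office files (.docx, .xlsx), in particular author, title and subject. I found this article from MS, which also includes some methods - http://msdn.microsoft.com/en-us/library/bb739835%28v=office.12%29.aspx but I can't seem to get it to work. The method I am using is: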

public static string WDRetrieveCoreProperty(string docName, string propertyName)
{
   // Given a document name and a core property, retrieve the value of the property.
   // Note that because this code uses the SelectSingleNode method, 
   // the search is case sensitive. That is, looking for "Author" is not 
   // the same as looking for "author".

   const string corePropertiesSchema = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";
   const string dcPropertiesSchema = "http://purl.org/dc/elements/1.1/";
   const string dcTermsPropertiesSchema = "http://purl.org/dc/terms/";

   string propertyValue = string.Empty;

   // Open read-only (isEditable: false); reading a property needs no write access.
   using (WordprocessingDocument wdPackage = WordprocessingDocument.Open(docName, false))
   {
      // Get the core properties part (core.xml).
      CoreFilePropertiesPart corePropertiesPart = wdPackage.CoreFilePropertiesPart;

      // Manage namespaces to perform XML XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("cp", corePropertiesSchema);
      nsManager.AddNamespace("dc", dcPropertiesSchema);
      nsManager.AddNamespace("dcterms", dcTermsPropertiesSchema);

      // Get the properties from the package.
      XmlDocument xdoc = new XmlDocument(nt);

      // Load the XML in the part into an XmlDocument instance.
      xdoc.Load(corePropertiesPart.GetStream());

      string searchString = string.Format("//cp:coreProperties/{0}", propertyName);

      XmlNode xNode = xdoc.SelectSingleNode(searchString, nsManager);
      if (xNode != null)
      {
         propertyValue = xNode.InnerText;
      }
   }

   return propertyValue;
}

So I call the method like this:

WDRetrieveCoreProperty(textBox1.Text, "Authors"); 
// textBox1 has path to some .docx file

But it always returns null. So what is wrong here?

2 answers:

Answer 0: (score: 2)

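I know this question is old, but I came across it while researching the same problem. The sample on MSDN contains example code for the method that retrieves a core property, but no example of how to call that method.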

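When passing the property name, I found that you have to include the namespace prefix. So accessing the lastModifiedBy core property with the OP's method looks like this: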

WDRetrieveCoreProperty(textBox1.Text, "cp:lastModifiedBy");
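
The same rule should apply to the three values the question asks for. In core.xml, author, title and subject are normally stored as the Dublin Core elements dc:creator, dc:title and dc:subject, so those are the names to pass. The sketch below repeats the OP's XPath lookup against an XML string instead of a real document, so it runs without the SDK. The sample XML, the class name CorePropertyDemo and the method name Lookup are made up for illustration; a real core.xml may contain more elements.

```csharp
using System;
using System.Xml;

static class CorePropertyDemo
{
    // Hypothetical stand-in for the docProps/core.xml part of a .docx file.
    const string SampleCoreXml =
        "<cp:coreProperties " +
        "xmlns:cp=\"http://schemas.openxmlformats.org/package/2006/metadata/core-properties\" " +
        "xmlns:dc=\"http://purl.org/dc/elements/1.1/\" " +
        "xmlns:dcterms=\"http://purl.org/dc/terms/\">" +
        "<dc:title>Quarterly report</dc:title>" +
        "<dc:subject>Sales</dc:subject>" +
        "<dc:creator>Jane Doe</dc:creator>" +
        "<cp:lastModifiedBy>John Roe</cp:lastModifiedBy>" +
        "</cp:coreProperties>";

    // Same XPath query as WDRetrieveCoreProperty, minus the package handling.
    public static string Lookup(string propertyName)
    {
        NameTable nt = new NameTable();
        XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
        nsManager.AddNamespace("cp", "http://schemas.openxmlformats.org/package/2006/metadata/core-properties");
        nsManager.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");
        nsManager.AddNamespace("dcterms", "http://purl.org/dc/terms/");

        XmlDocument xdoc = new XmlDocument(nt);
        xdoc.LoadXml(SampleCoreXml);

        // An unprefixed name only matches elements in no namespace,
        // which is why "Authors" finds nothing.
        XmlNode xNode = xdoc.SelectSingleNode("//cp:coreProperties/" + propertyName, nsManager);
        return xNode != null ? xNode.InnerText : string.Empty;
    }

    static void Main()
    {
        Console.WriteLine("[" + Lookup("Authors") + "]");  // no match, empty string
        Console.WriteLine(Lookup("dc:creator"));            // author
        Console.WriteLine(Lookup("dc:title"));              // title
        Console.WriteLine(Lookup("dc:subject"));            // subject
    }
}
```

With the OP's method the equivalent call would be WDRetrieveCoreProperty(textBox1.Text, "dc:creator"). Note that the method as posted starts from string.Empty, so a failed lookup comes back as an empty string rather than null.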

Answer 1: (score: 0)

I did it like this...

using System.IO.Packaging; // Assembly WindowsBase.dll
  :
     static void Main(string[] args)
     {
        String path = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData);
        String file = Path.Combine(path, "Doc1.docx");

        // The using block closes the package even if reading a property throws.
        using (Package docx = Package.Open(file, FileMode.Open, FileAccess.Read))
        {
           String subject = docx.PackageProperties.Subject;
           String title = docx.PackageProperties.Title;
        }
     }
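
The question also asks for the author, which this approach exposes as PackageProperties.Creator. Below is a minimal sketch of reading all three values through System.IO.Packaging. To keep it self-contained it first writes a small package to a temporary file and then reads it back; that file is only a stand-in with core properties set, not a real Word document, and the class and method names (PackagePropertiesDemo, ReadAuthorTitleSubject) are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // WindowsBase.dll on .NET Framework

static class PackagePropertiesDemo
{
    // Returns { author, title, subject }; a property that is not set comes back as null.
    public static string[] ReadAuthorTitleSubject(string file)
    {
        using (Package package = Package.Open(file, FileMode.Open, FileAccess.Read))
        {
            PackageProperties props = package.PackageProperties;
            return new[] { props.Creator, props.Title, props.Subject };
        }
    }

    static void Main()
    {
        // Assumption: a throwaway package stands in for a real .docx here.
        string file = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".docx");
        try
        {
            using (Package package = Package.Open(file, FileMode.Create, FileAccess.ReadWrite))
            {
                package.PackageProperties.Creator = "Jane Doe";
                package.PackageProperties.Title = "Quarterly report";
                package.PackageProperties.Subject = "Sales";
            }

            string[] values = ReadAuthorTitleSubject(file);
            Console.WriteLine("Author:  " + values[0]);
            Console.WriteLine("Title:   " + values[1]);
            Console.WriteLine("Subject: " + values[2]);
        }
        finally
        {
            File.Delete(file);
        }
    }
}
```

Since .docx and .xlsx are both Open Packaging Conventions packages, the same read method should work for either file type without referencing the Open XML SDK at all.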