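Removing the contents of custom properties in Word 2010 using Open XML SDK 2.0 or LINQ

Time: 2011-01-18 11:18:43

Tags: c# linq ms-word openxml-sdk openxml

I am trying to strip sensitive information from a Word file before it is sent out from our system. Below is an example of the custom properties in a file that is about to be sent. I want to remove the contents of filePath and templateFilePath.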

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="2" name="docId">
        <vt:lpwstr>123</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="3" name="verId">
        <vt:lpwstr>1</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="4" name="templateId">
        <vt:lpwstr>321</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="5" name="fileId">
        <vt:lpwstr>123</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="6" name="filePath">
        <vt:lpwstr>I want to remove this</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="7" name="templateFilePath">
        <vt:lpwstr>I want to remove this</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="8" name="filePathOneNote">
        <vt:lpwstr>\</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="9" name="fileName">
        <vt:lpwstr>test.docx</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="10" name="comment">
        <vt:lpwstr>Test comment</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="11" name="sourceId">
        <vt:lpwstr>12345</vt:lpwstr>
    </property>
    <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="12" name="module">
        <vt:lpwstr>Document</vt:lpwstr>
    </property>
</Properties>

I got this code from the Open XML SDK Productivity Tool:

private static void ChangeCustomFilePropertiesPart(CustomFilePropertiesPart customFilePropertiesPart)
{
    CustomProperties.Properties properties = customFilePropertiesPart.Properties;

    CustomProperties.CustomDocumentProperty customDocumentProperty1 = properties.Elements<CustomProperties.CustomDocumentProperty>().ElementAt(4);
    CustomProperties.CustomDocumentProperty customDocumentProperty2 = properties.Elements<CustomProperties.CustomDocumentProperty>().ElementAt(5);

    VariantTypes.VTLPWSTR vTLPWSTR1 = customDocumentProperty1.GetFirstChild<VariantTypes.VTLPWSTR>();
    vTLPWSTR1.Text = "";


    VariantTypes.VTLPWSTR vTLPWSTR2 = customDocumentProperty2.GetFirstChild<VariantTypes.VTLPWSTR>();
    vTLPWSTR2.Text = "";

}

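However, I can't rely on the properties I want to clear always being at positions 4 and 5, so I have to find them by their name attribute before clearing the text. Can anyone help me? I would like to do this with either LINQ or the Open XML SDK.

Thanks!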

2 answers:

Answer 0 (score: 2)

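You shouldn't query by @pid, because that can change. Query by @name instead, since the name of a custom document property always stays the same. In your case, just use a lambda query with Where @name = "templateFilePath", set its .Value to empty, repeat, and save.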
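The approach described above could be sketched with LINQ to XML roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `CustomPropertyScrubber` and the method `ClearByName` are hypothetical, and it assumes the custom properties XML (the `docProps/custom.xml` content shown in the question) has already been loaded into an `XDocument`. Reading the part out of the .docx package and writing it back are left out.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; the names are illustrative and not part of any SDK.
public static class CustomPropertyScrubber
{
    // Namespaces copied from the custom properties sample in the question.
    private static readonly XNamespace Cp =
        "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties";
    private static readonly XNamespace Vt =
        "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes";

    // Blanks the vt:lpwstr value of every property whose name attribute is
    // in 'names'. Returns how many values were actually cleared.
    public static int ClearByName(XDocument customXml, params string[] names)
    {
        // Match on the name attribute, never on pid or element position.
        var matches = customXml.Root
            .Elements(Cp + "property")
            .Where(p => names.Contains((string)p.Attribute("name")))
            .ToList();

        int cleared = 0;
        foreach (var property in matches)
        {
            // Only string-typed values are handled in this sketch.
            var value = property.Element(Vt + "lpwstr");
            if (value != null)
            {
                value.Value = string.Empty;
                cleared++;
            }
        }
        return cleared;
    }
}
```

A call such as `CustomPropertyScrubber.ClearByName(doc, "filePath", "templateFilePath")` would then blank both values in one pass, whatever their pid or position, and leave the other properties untouched.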

Answer 1 (score: 0)

Here is the code I came up with:

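// Using directives and namespace aliases assumed by the type names below.
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using CustomProperties = DocumentFormat.OpenXml.CustomProperties;
using VariantTypes = DocumentFormat.OpenXml.VariantTypes;

private void ChangeCustomFilePropertiesPart(CustomFilePropertiesPart customFilePropertiesPart)
{
    // Select the properties by name rather than by position or pid.
    var props = from n in customFilePropertiesPart.Properties.Elements<CustomProperties.CustomDocumentProperty>()
                where n.Name == "filePath" || n.Name == "templateFilePath"
                select n;

    foreach (var prop in props)
    {
        // Only string (vt:lpwstr) values are handled; skip anything else.
        VariantTypes.VTLPWSTR value = prop.GetFirstChild<VariantTypes.VTLPWSTR>();
        if (value != null)
        {
            value.Text = "";
        }
    }
}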