Deleting sections of an MS Word document using Open XML WordprocessingDocument

Asked: 2012-11-21 09:43:18

Tags: text openxml

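I am using C# and the OpenXml DLL to modify an existing MS Word document. I can already replace some tags in the document and save the changes, but I have not yet managed to delete parts of the text.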

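For example, my document has many headings (Heading1 text style), each followed by body text. I want to programmatically delete a given heading and all the text that follows it, up to the next heading.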

Sample original document:

Heading 1 Body text 1 ... ...

Heading 2 Body text 2 ... ...

Heading 3 Body text 3 ... ...

If the user wants to delete Heading 2, the output document should be:

Heading 1 Body text 1 ... ...

Heading 3 Body text 3 ... ...

Is this possible? Does anyone know how to do it?

2 answers:

Answer 0: (score: 2)

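It depends on how the data (the paragraphs) is organized.

If each heading is immediately followed by its body paragraph, just loop over the paragraphs, find the one containing the heading text, and delete it together with the paragraph that follows.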

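// Requires: using System.Linq;
bool remove = false;

// Iterate over a snapshot (ToList): changing the tree while enumerating
// Descendants() directly is unreliable and can end the loop early.
foreach (Paragraph p in body.Descendants<Paragraph>().ToList())
{
    if (remove)
    {
        // This is the paragraph right after the heading: delete it too.
        p.Remove();
        remove = false;
        continue;
    }

    if (p.InnerText.Contains("Heading 2"))
    {
        // Delete the heading and flag the next paragraph for deletion.
        p.Remove();
        remove = true;
    }
}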
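
The loop above removes only one paragraph after the heading. If a section can contain several paragraphs or tables, a possible variation is to keep deleting until the next heading is reached. The following is only a sketch under some assumptions that are not in the original answer: headings are direct children of the body, they are identified by the style id "Heading1", and the heading text is matched exactly. The names SectionRemover, RemoveSection and IsHeading are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SectionRemover
{
    // True when the element is a paragraph whose style id equals headingStyleId.
    private static bool IsHeading(OpenXmlElement element, string headingStyleId)
    {
        string styleId = (element as Paragraph)?.ParagraphProperties?.ParagraphStyleId?.Val?.Value;
        return string.Equals(styleId, headingStyleId, StringComparison.OrdinalIgnoreCase);
    }

    // Removes the first heading whose text equals headingText, plus every
    // following top-level element up to (not including) the next heading.
    // Returns the number of top-level elements removed.
    public static int RemoveSection(Body body, string headingText, string headingStyleId = "Heading1")
    {
        var toRemove = new List<OpenXmlElement>();
        bool inSection = false;

        // First pass: only collect, so the tree is not modified while enumerating.
        foreach (OpenXmlElement element in body.ChildElements)
        {
            if (IsHeading(element, headingStyleId))
            {
                if (inSection)
                    break; // the next heading ends the section being deleted
                inSection = element.InnerText.Trim() == headingText;
            }

            // Leave the trailing section properties (page size, margins) in place.
            if (inSection && !(element is SectionProperties))
                toRemove.Add(element);
        }

        // Second pass: detach the collected elements from the body.
        foreach (OpenXmlElement element in toRemove)
            element.Remove();

        return toRemove.Count;
    }
}
```

It could be called on an opened document with something like SectionRemover.RemoveSection(wordDoc.MainDocumentPart.Document.Body, "Heading 2"), followed by wordDoc.MainDocumentPart.Document.Save(). Collecting first and removing afterwards avoids changing the element tree in the middle of the enumeration.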

Answer 1: (score: 1)

Here is the code I used to solve the problem:

        List<OpenXmlElement> ElementsToDeleteList = new List<OpenXmlElement>();
        bool IsParagraphsToDelete = false;
        ...
        // Execute headings removal
        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(sOutputFileName, true))
        {
            foreach (OpenXmlElement element in wordDoc.MainDocumentPart.RootElement.Descendants())
            {
                if (element.GetType() == typeof(Paragraph))
                {
                    Paragraph paragraph = (Paragraph)element;
                    if (paragraph.ParagraphProperties != null && paragraph.ParagraphProperties.ParagraphStyleId != null &&
                        paragraph.ParagraphProperties.ParagraphStyleId.Val != null && paragraph.ParagraphProperties.ParagraphStyleId.Val.Value != null)
                    {
                        if (paragraph.ParagraphProperties.ParagraphStyleId.Val.Value.ToLower().Contains(MainHeaderStyleName.ToLower()) ||
                            paragraph.ParagraphProperties.ParagraphStyleId.Val.Value.ToLower().Contains(SecondaryHeaderStyleName.ToLower()))
                        {
                            StringBuilder sb = new StringBuilder();
                            foreach (var run in paragraph.Elements<Run>())
                                sb.Append(run.InnerText);

                            string ChapterTitle = sb.ToString().Trim().ToUpper();
                            // Start deleting when this chapter is listed as excluded (needs using System.Linq).
                            IsParagraphsToDelete = ListOfDocumentTests.Any(x => x.Title.ToUpper().Trim() == ChapterTitle && x.IsIncluded == false);

                            // Empty heading paragraphs are removed as well.
                            if (string.IsNullOrEmpty(ChapterTitle) && !IsParagraphsToDelete)
                                ElementsToDeleteList.Add(paragraph);
                        }
                    }
                }

                if (IsParagraphsToDelete && (element.GetType() == typeof(Paragraph) || element.GetType() == typeof(Table)))
                {
                    ElementsToDeleteList.Add(element);
                }

            }

            foreach (OpenXmlElement elemToDelete in ElementsToDeleteList)
            {
                elemToDelete.RemoveAllChildren();
                elemToDelete.Remove();
            }


            wordDoc.MainDocumentPart.Document.Save();

        }